Increasing the Predictability of European Stock Returns Using Fundamental Analysis


A.J. KRUIZE*

Master’s Thesis for the MSc Finance programme of the University of Groningen

Supervisor: Dr. L. Dam

March 13th, 2013

ABSTRACT

This thesis combines the use of valuation ratios such as the dividend yield and the market-to-book ratio with the use of fundamental signals of firm performance as predictors of stock returns. The in-sample predictability of stock returns increases when these signals are added to a base model of predictors. The in-sample predictability also increases when the forecast horizon is increased from one year to five or seven years. These results do not hold out-of-sample and also change when alternative specifications of the forecasting regressions are used.

Keywords: Panel data analysis, stock return predictability, fundamental analysis, out-of-sample forecasts

JEL codes: C33, G11, G17

I. Introduction

This thesis is structured around two central questions: Can fundamental analysis increase the predictability of stock returns? And does the predictability increase when the forecast horizon is increased? These questions are answered separately for the in-sample and out-of-sample forecasts, resulting in two different answers. The in-sample predictability of stock returns increases when the signals are added to a base model of predictors. The in-sample predictability also increases when the forecast horizon is increased from one year to five or seven years. However, these results do not hold out-of-sample, and the alternative specifications of the forecasting regressions also change the outcome.

* Student at the University of Groningen, Faculty of Economics and Business, The Netherlands. E-mail: a.j.kruize@student.rug.nl, student number: s1772260. I would like to thank my supervisor, Dr. Lammertjan Dam, for his assistance in performing the analysis and for providing me with honest and useful feedback.

The predictability of stock returns has been the topic of a lively debate for many years, and the conventional wisdom is that stock prices are random and hence cannot be forecasted. This random walk view was popularised by contributions such as Malkiel (1973) in his book ‘A Random Walk Down Wall Street’. In this book he argues that stock prices do not follow any pattern, resulting in random realisations of stock returns that cannot be forecasted. There is, however, abundant evidence against this random walk view, summarised by Lo and MacKinlay (1999) in their book titled ‘A Non-Random Walk Down Wall Street’. They show that the trends in stock price movements can be explained by the historical prices and some drift factor. Even though the realisations of the stock returns can be random, the underlying equity risk premium varies over time and can be forecasted; this is also found by Polk, Thompson and Vuolteenaho (2006).

The dividend-price ratio, dividend yield, market-to-book ratio and earnings yield are used extensively to predict either the future stock returns or the equity risk premium. In addition to these valuation ratios there is an abundance of research on non-earnings accounting numbers and their predictive power on future stock returns. The use of accounting data instead of valuation ratios such as the dividend yield is generally referred to as fundamental analysis. Kothari (2001, p. 109) phrases it as follows: “Fundamental analysis entails the use of information in current and past financial statements, in conjunction with industry and macroeconomic data to arrive at a firm’s intrinsic value.”

The majority of the existing literature on stock return predictability focusses on the US and uses data going back to the 1900s. In this thesis, however, my focus is on European stocks and the analysis starts in the 1980s because the company data were not available for earlier years. The focus is on Europe given the bias towards the US in the existing literature and the resulting lack of evidence on European stock return predictability. I investigate whether fundamental signals constructed from non-earnings accounting information can increase the predictability of stock returns relative to a model including only the dividend yield, the earnings per share and the market-to-book ratio.¹ The method employed closely follows earlier work by, for example, Abarbanell and Bushee (1997 and 1998), Piotroski (2000) and Elleuch (2009) and is explained in detail in Section III. However, there are some distinctions in the applied methodology that are worth mentioning up front. I combine the use of both in-sample and out-of-sample forecasts and base the conclusions on predictability on two different specifications of the goodness of fit statistic, the within and the overall. Next to the two measures of the goodness of fit, I also use three different models to forecast: the base model mentioned before, a full model with eleven fundamental signals and a factor model based on a factor analysis.

The remainder of this thesis is structured as follows: first the most significant contributions to the literature are discussed, followed by a detailed description of the methodology, and after that the data is described. Subsequently the results are discussed followed by the conclusion and limitations. The Appendices provide additional information on the method, data and variables as well as the detailed calculations of the signals and some additional tables with regression results.

II. Literature

Both investors and academic researchers have long been searching for methods to identify undervalued stocks that are expected to earn a high return in the future, with Graham and Dodd (1934) being one of the first and most influential studies on this topic. Their book titled ‘Security Analysis’ is particularly interesting since Cowles (1933) concludes that there is no evidence that anyone can beat the stock market and that stock returns cannot be forecasted. This finding is confirmed by Jensen (1968), who uses information on a large sample of mutual funds to show that most of the fund managers fail to generate a so-called positive alpha, meaning that they are unable to beat the market.

¹ The market-to-book ratio is referred to as the market-to-book value in the tables and text throughout this thesis to be consistent with the data. See also Appendix B for the sources of the data.

Despite this early finding that the stock returns cannot be forecasted, many researchers confirm the predictability of stock returns: either directly by forecasting the stock returns or indirectly by forecasting the equity risk premium.

The three predictors most used for this purpose are the dividend yield, the dividend-price ratio and the earnings yield. Rozeff (1984) concludes that the dividend yield can be used to predict stock returns and Fama and French (1988) confirm this result. Campbell and Shiller (1988a) show that the dividend-price ratio can forecast long horizon returns. Goyal and Welch (2003) also investigate the return predictability of the dividend-price ratio and find that there is some in-sample predictability but this predictability does not exist out-of-sample.

Campbell and Shiller (1988b) confirm the predictability for the earnings yield as well. Besides the preceding predictors, the market-to-book ratio has been proven to be a good predictor of stock returns too (Fama and French, 1992). Pontiff and Schall (1998) and Kothari and Shanken (1997) conclude that the market-to-book ratio is a stronger predictor of future stock returns than the dividend yield or earnings yield.

More recently, the focus of the predictability literature shifted to the distinction between in-sample and out-of-sample predictability. Goyal and Welch (2008) confirm their earlier results and find that most models provide unstable in-sample results and that the models fail to predict out-of-sample. Cochrane (2008) also finds disappointing out-of-sample results, but he adds that these results are to be expected since the statistical power of the out-of-sample tests is weak. As a more general remark, Polk, Thompson and Vuolteenaho (2006) find that the equity risk premium cannot be forecasted at all because its realisations are too noisy.

Based on the literature above, the general belief is that even though stock returns can be forecasted in-sample, this predictability disappears when the forecasts are performed out-of-sample. In response to the new focus on out-of-sample testing, Inoue and Kilian (2004) show that the focus on out-of-sample results is unfair. They conclude that in-sample tests have a higher statistical power in forecasting compared to out-of-sample tests and that this higher power is not caused by data mining or parameter instability. These arguments are also used by Cochrane (2008) when he finds disappointing out-of-sample results.

The second generally accepted view is that the return predictability grows as the forecast horizon grows. Many authors, including Fama and French (1988), Campbell (1991) and Cochrane (1992), conclude that the return predictability grows when the forecast horizon is larger. An exception is Ang and Bekaert (2007), who find that there is indeed short horizon predictability but that these results are not confirmed for a longer horizon. A possible explanation by Boudoukh, Richardson and Whitelaw (2008) is that the increased predictability for longer horizons does not come from an actual improved predictability but from sampling variation.

A second broad range of studies focusses on accounting information to examine the predictability of returns. Ou and Penman (1989b) use the price-earnings ratio in their research, and Ou (1990) uses several individual non-earnings accounting numbers to show a link with future earnings. Chan, Hamao and Lakonishok (1991) investigate the link with cash flow yield, whereas Stober (1993) uses receivables to investigate the link with earnings. Kerstein and Kim (1995) focus on capital expenditures, and accruals are the topic of research by Sloan (1996). A well-known paper by Fama and French (1992) focusses on both the size of companies (SMB) and the book-to-market ratio (HML). The latter is also examined in other studies, including Lakonishok, Shleifer and Vishny (1994) and Davis (1994). Additionally, several earnings related variables have been investigated. Studies examining these variables include Basu (1977), focussing on the earnings yield, and LaPorta (1996), using forecasted long-term earnings growth.

A final strand of the literature comprises the use of multiple predictors at once instead of focussing on a single predictor. One of the first studies using multiple descriptors of fundamental value was Ou and Penman (1989a), followed by Penman (1992) and Holthausen and Larcker (1992). This started a new branch of research investigating the link between future earnings or returns with several combined variables rather than a single variable. The first study to include multiple predictors of stock returns at once was Lev and Thiagarajan (1993).

Their new methodology inspired many other studies including Abarbanell and Bushee (1997 and 1998) and Piotroski (2000). In my research I combine the methods of Piotroski (2000) and a more recent paper by Elleuch (2009) who constructs twelve conceptually appealing fundamental signals of future performance.

III. Method

The methodology in this thesis is partly based on prior research by Elleuch (2009), Piotroski (2000) and Abarbanell and Bushee (1997 and 1998). These papers use fundamental signals calculated from accounting data together with some return on a (market) portfolio. They test whether including the accounting information increases the return on a portfolio of stocks. In this research I forecast the individual cumulative returns of the companies in the EURO STOXX Index.

The return measure is constructed from the total return index ($RI$), which assumes that any cash dividends are reinvested. The future returns are constructed as the cumulative returns over the entire forecast horizon, where the horizon is either one, five or seven years:

$$r^{h}_{i,t} = \left( \frac{RI_{i,t+h}}{RI_{i,t}} \right) - 1 \qquad (1)$$

where $r^{h}_{i,t}$ denotes the realized cumulative returns over the next $h$ years for company $i$ at year $t$.
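The construction of the cumulative return from a total return index can be sketched as follows. This is a minimal illustration, not the thesis's own code: it assumes a simple (not logarithmic) return, and the function name and the index levels are hypothetical.

```python
def cumulative_return(total_return_index, year, horizon):
    """Simple cumulative return from `year` to `year + horizon`.

    `total_return_index` maps a year to the end-of-year index level
    (dividends reinvested). Using a simple rather than a log return
    is an assumption made for this sketch.
    """
    start = total_return_index[year]
    end = total_return_index[year + horizon]
    return end / start - 1.0


# Hypothetical index levels for one company (illustrative numbers only).
index_levels = {2000: 100.0, 2001: 110.0, 2005: 150.0, 2007: 180.0}

# Cumulative returns over the three horizons used in the thesis.
returns_by_horizon = {
    h: cumulative_return(index_levels, 2000, h) for h in (1, 5, 7)
}
```

Because the index already reinvests dividends, no separate dividend term is needed in the calculation.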

A. Models

Three different models will be used to forecast the returns in this paper. The first one is the so-called base model and consists of the three variables commonly used as predictors of stock returns. The second model is the full model and adds eleven fundamental signals of firm performance to the three base variables. As a third specification the eleven signals are reduced to three different factors using factor analysis. These factors are added to the base model in the factor model.

These three models are used to test whether adding the signals or factors to the base model increases or decreases the goodness of fit of the predictions of the stock returns.

A.1. Base Model and Full Model

The base model consists of the earnings per share ($EPS$), the dividend yield ($DY$) and the market-to-book value ($MTBV$). These variables are used to forecast the future stock return of the firms in a panel least squares (PLS) regression with cross-section fixed effects. The cross-section fixed effects are used for both theoretical and empirical reasons. For most of the regressions the redundant fixed effects tests conclude that a single intercept is not the model with the best fit, i.e. the cross-section fixed effects are not redundant. The redundancy of the fixed effects is tested with a Hausman likelihood ratio test.

Because of the large number of regressions not all test results are reported for each of the different individual forecasts; however, Table AVIII in Appendix D reports the test results for the full sample period (1982-2011). Besides this empirical evidence, it is also theoretically logical to use a cross-section fixed effects model to forecast future stock returns with panel data. The 298 firms used are all different from each other and also perform differently and the changes in the returns vary greatly between firms. The changes in stock returns over time however follow the market movements to a great extent. The period fixed effects would therefore not be informative since the market movements are generally the same for all firms. However, the cross-section fixed effects are informative on the performance of the individual companies. The base model including these fixed effects is represented by the following equation:

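$$r^{h}_{i,t} = \alpha_i + \beta_1 EPS_{i,t} + \beta_2 DY_{i,t} + \beta_3 MTBV_{i,t} + \varepsilon_{i,t} \qquad (2)$$

where $\alpha_i$ is the intercept capturing the cross-section fixed effects and $\varepsilon_{i,t}$ is an error term. As stated above, the superscript $h$ denotes the forecast horizon, either one, five or seven years. The full model is obtained by adding a vector $S_{j,i,t}$ to include the eleven signals calculated from accounting data,² where the subscript $j$ denotes the individual coefficients for each of the signals:

$$r^{h}_{i,t} = \alpha_i + \beta_1 EPS_{i,t} + \beta_2 DY_{i,t} + \beta_3 MTBV_{i,t} + \sum_{j=1}^{11} \gamma_j S_{j,i,t} + \varepsilon_{i,t} \qquad (3)$$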
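To illustrate what a regression with cross-section fixed effects does, the sketch below estimates a single slope and one intercept per firm by demeaning the data within each firm (the within estimator). It is a simplified illustration with one regressor instead of the three base variables, and the function name and the data are hypothetical.

```python
from collections import defaultdict


def within_estimator(rows):
    """Estimate y = a_firm + b * x with one intercept per firm.

    rows: list of (firm, x, y) observations.
    Returns (b, {firm: a_firm}).
    """
    grouped = defaultdict(list)
    for firm, x, y in rows:
        grouped[firm].append((x, y))

    # Firm-specific averages of the regressor and the dependent variable.
    means = {
        firm: (
            sum(x for x, _ in obs) / len(obs),
            sum(y for _, y in obs) / len(obs),
        )
        for firm, obs in grouped.items()
    }

    # OLS slope on the demeaned data; the firm intercepts drop out.
    sxy = sum((x - means[f][0]) * (y - means[f][1]) for f, x, y in rows)
    sxx = sum((x - means[f][0]) ** 2 for f, x, y in rows)
    slope = sxy / sxx

    # Recover each firm's intercept from its averages.
    intercepts = {f: my - slope * mx for f, (mx, my) in means.items()}
    return slope, intercepts


# Hypothetical data: two firms with the same slope but different levels.
observations = [
    ("A", 1.0, 2.5), ("A", 2.0, 4.5), ("A", 3.0, 6.5),
    ("B", 1.0, 1.0), ("B", 2.0, 3.0), ("B", 3.0, 5.0),
]
slope, intercepts = within_estimator(observations)
```

The firm intercepts absorb differences in the level of returns between companies, which is the role the cross-section fixed effects play in the forecasting regressions.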

A.2. Factor Analysis and Resulting Factor Model

By construction, the full model will always have a better in-sample goodness of fit than the base model, since adding more variables generally increases the in-sample fit of a model. For the in-sample forecasts this bias can be removed by the use of the adjusted $R^2$; this option is discussed below. This relationship does not hold out-of-sample, since a larger number of variables has the potential to lower the out-of-sample fit.

Another way to remove this bias is to reduce the number of independent variables in the model. When the independent variables are correlated this can be done by creating different factors using factor analysis. The signals are correlated with one another, as can be seen in Table AIV in Appendix B. Two different methods for performing a factor analysis are used in this thesis: exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). The difference between the two methods lies in the assumptions on the underlying structure of the covariance between the variables. EFA assumes that there is no predetermined structure and that any underlying factor can contribute to the variance of any variable. In contrast, CFA assumes a predetermined structure in the variables and assumes that every factor is responsible for the variance in a predetermined number of variables. Empirically the EFA is more powerful since it has no restricting assumptions on the underlying structure between the variables, but the CFA has more practical relevance in the sense that it provides insight into the structure and the sources of the variance.

² More details on the signals can be found in Section IV and Appendix B provides the detailed calculations.

As a starting point for the factor analysis the EFA is used. Table AVI in Appendix C reports the detailed results and the factor loadings, but the general conclusion is that the EFA method results in a single factor with a poor fit and thus concludes that the individual variables should be used instead of a factor model.

In addition the second method is used to test whether the variance of the signals can be traced back to a smaller number of underlying factors, where these factors still have a practical meaning. For the CFA the eleven signals are grouped in three groups of variables related to sales, total assets or productivity (see also Table I). By manually setting the number of factors the factor model loses explanatory power but retains its conceptual significance. Again the detailed results of the analysis are reported in Table AV in Appendix C. As a disappointing conclusion, the CFA also results in three statistically insignificant factors. Despite the fact that both the EFA and the CFA conclude that the factors have a poor fit the factor models are still used in the further analysis. The factors are still used because they add value for the out-of-sample forecasts. The three factors obtained by the CFA are used because they have a conceptually appealing meaning whereas the single factor obtained by the EFA has only a statistical meaning. An important note to make is that since both methods result in insignificant factors, not too much emphasis should be placed on the factor models in the discussion of the results.

The three factors are constructed by multiplying the obtained factor loadings with the signals for the entire sample period, so the factor loadings are assumed to be constant over time. If the factor loadings were calculated individually for each sub-period for the in-sample forecasts, they should also be calculated individually for the out-of-sample forecasts. But what period should be used to calculate the out-of-sample period factor loadings? If the estimation period is used, then the factor loadings do not correspond to the period of the forecast. But if the forecast period is used to calculate the factor loadings, then the forecasts contain information from the forecast period and are therefore not proper out-of-sample forecasts anymore. The solution to this problem is to calculate the factor loadings for the entire period and assume they are constant over time. However, the subscripts $i$ and $t$ do remain in the notation because the factors still have different values every year and for every company. The factor model is constructed by adding the three factors to the base model:

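$$r^{h}_{i,t} = \alpha_i + \beta_1 EPS_{i,t} + \beta_2 DY_{i,t} + \beta_3 MTBV_{i,t} + \delta_1 F^{S}_{i,t} + \delta_2 F^{A}_{i,t} + \delta_3 F^{P}_{i,t} + \varepsilon_{i,t} \qquad (4)$$

where the vectors $F^{S}_{i,t}$, $F^{A}_{i,t}$ and $F^{P}_{i,t}$ denote the factors sales, assets and productivity respectively.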

Table I

Composition of the Factors Used in the Factor Model

This table lists the variables included in each of the factors in the Confirmatory Factor Analysis. Eq. refers to the corresponding Equations in Appendix B with the detailed calculations and Load. is the factor loading.

Factor         Signal                        Eq.      Load.
Sales          Accounts Receivable           (A10)    0.541
Sales          Capital Expenditures          (A3)     0.184
Sales          Inventory                     (A9)     0.433
Sales          Operating Margin              (A10)   -0.067
Assets         Accruals                      (A2)     0.960
Assets         Cash Flow                     (A4)    -0.829
Assets         Change in Leverage            (A6)     0.135
Assets         Change in Liquidity           (A7)     0.446
Productivity   Change in Asset Turnover      (A5)     0.488
Productivity   Change in Return on Assets    (A8)     0.519
Productivity   Sales per Employee            (A11)    0.309
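The construction of a factor from constant loadings can be sketched as follows. The loadings are those reported for the productivity factor in Table I; the signal values are hypothetical, and summing the loading-weighted signals is an assumption about how the products are combined into one factor value per firm and year.

```python
# Factor loadings for the productivity factor as reported in Table I.
PRODUCTIVITY_LOADINGS = {
    "change_in_asset_turnover": 0.488,
    "change_in_return_on_assets": 0.519,
    "sales_per_employee": 0.309,
}


def factor_score(signals, loadings):
    """Loading-weighted sum of the signals for one firm-year.

    The loadings are held constant over time; only the signal values
    differ between firms and years.
    """
    return sum(loadings[name] * signals[name] for name in loadings)


# Hypothetical signal values for a single firm in a single year.
firm_year_signals = {
    "change_in_asset_turnover": 0.10,
    "change_in_return_on_assets": 0.20,
    "sales_per_employee": -0.10,
}
productivity_factor = factor_score(firm_year_signals, PRODUCTIVITY_LOADINGS)
```

Because the loadings are fixed for the entire sample period, the same function can be applied in both the estimation and the forecast period without using information from the forecast period to re-estimate them.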

A.3. Forecasting

All three models are used to forecast the future cumulative returns over different horizons. In line with the existing literature on return predictability, two types of forecasts are used: simple in-sample forecasts and out-of-sample forecasts. The in-sample forecasts are made by simply regressing the future returns on the explanatory variables using Equations 2-4. The out-of-sample forecasts are made by first estimating the coefficients of the model on a sub-period of years called the estimation period; these coefficients are then used to evaluate Equations 2-4 in the evaluation or forecast period. The motivation for this method is that the in-sample forecasts incorporate information beyond year $t$ in the coefficients used to forecast the future return in year $t$. With the out-of-sample forecasts the coefficients used to forecast the future returns in year $t$ do not include any observations beyond year $t$ and therefore no information beyond that year. Table II below reports the different estimation and forecast periods for both the in-sample and out-of-sample forecasts. The resulting out-of-sample estimates and goodness of fit statistics are generally believed to be a stronger test of predictability than the in-sample forecasts. This is, however, debated by for example Inoue and Kilian (2004), who state that the power of in-sample forecasts is actually greater. The in-sample forecasts generally yield more powerful results in terms of goodness of fit of the models, but they are biased in the sense that the estimation and forecast periods coincide. The out-of-sample forecasts provide, at least theoretically, weaker results, and can be seen as a test of the practical relevance of the forecasts.

Table II

In-Sample and Out-of-Sample Estimation and Forecast Periods

This table reports the in-sample and out-of-sample periods used to forecast the stock returns. For the out-of-sample forecasts the first column reports the estimation period used to estimate the coefficients used in the forecast period. For the in-sample forecasts these periods are the same.

In-Sample                         Out-of-Sample
Estimation and Forecast Period    Estimation Period    Forecast Period
1992 – 2001                       1982 – 1991          1992 – 2001
1992 – 2011                       1982 – 1991          1992 – 2011
2002 – 2011                       1992 – 2001          2002 – 2011
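The out-of-sample procedure can be sketched as below: the coefficients are estimated on the estimation period only and then applied to the explanatory variable in the forecast period. This is a simplified illustration with a single regressor and firm fixed effects; the function names and the panel data are hypothetical, and skipping firms that do not appear in the estimation period is an assumption of this sketch.

```python
from collections import defaultdict


def fit_fixed_effects(rows):
    """Fit y = a_firm + b * x by demeaning within each firm.

    rows: list of (firm, year, x, y). Returns (b, {firm: a_firm}).
    """
    grouped = defaultdict(list)
    for firm, _year, x, y in rows:
        grouped[firm].append((x, y))
    means = {
        firm: (
            sum(x for x, _ in obs) / len(obs),
            sum(y for _, y in obs) / len(obs),
        )
        for firm, obs in grouped.items()
    }
    sxy = sum((x - means[f][0]) * (y - means[f][1]) for f, _, x, y in rows)
    sxx = sum((x - means[f][0]) ** 2 for f, _, x, y in rows)
    slope = sxy / sxx
    return slope, {f: my - slope * mx for f, (mx, my) in means.items()}


def out_of_sample_forecasts(rows, estimation_period, forecast_period):
    """Estimate on one period, forecast in another.

    Periods are (first_year, last_year) tuples, both inclusive.
    Firms without an estimated intercept are skipped (an assumption).
    """
    first, last = estimation_period
    estimation_rows = [r for r in rows if first <= r[1] <= last]
    slope, intercepts = fit_fixed_effects(estimation_rows)

    first, last = forecast_period
    return {
        (firm, year): intercepts[firm] + slope * x
        for firm, year, x, _y in rows
        if first <= year <= last and firm in intercepts
    }


# Hypothetical panel: two firms whose returns are exactly linear in x.
TRUE_INTERCEPTS = {"A": 0.10, "B": -0.05}
panel = [
    (firm, year, (year - 1980) / 100.0, a + 0.5 * (year - 1980) / 100.0)
    for firm, a in TRUE_INTERCEPTS.items()
    for year in range(1982, 2002)
]
forecasts = out_of_sample_forecasts(panel, (1982, 1991), (1992, 2001))
```

In this artificial example the relationship is stable over time, so the forecasts match the realised values; with real data the coefficients estimated in the first period need not hold in the second, which is what the out-of-sample test examines.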

B. Goodness of Fit Statistics

The focus of this thesis is not on the statistical significance of the various individual signals and factors, but on how much the signals and factors contribute to the goodness of fit of the models as a whole. Therefore, the coefficients and t-statistics of the individual variables, factors, and signals are not reported in the main text. To reach a conclusion on the incremental effects of the various signals and factors on the goodness of fit, a goodness of fit statistic has to be constructed that can be compared between the different models.

Unfortunately, the nature of the data (i.e. panel data) does not allow for an easy comparison. Broadly speaking, there are three measures of goodness of fit for panel data. The problem lies in the measure of the total sum of squares (TSS) of a regression. In a simple ordinary least squares (OLS) regression the squared deviations from the average value of the dependent variable are simply summed to reach the TSS. But with a PLS model there are different methods to calculate the average. The first is the average of a single cross-section over all the periods, thus comparing every observation against the average of the cross-section it belongs to: this is the within $R^2$ measure. The second measure, the between $R^2$, uses the average of a single year over all the cross-sections. The third measure, the overall $R^2$, is calculated by using the average over all periods and all cross-sections; this corresponds to the $R^2$ statistic of an OLS regression. There is no consensus in the literature on which measure of $R^2$ should be used for a PLS regression. In this paper the within $R^2$ is used, following Campbell and Thompson (2008) and other papers that use this measure based on the recommendations from Wooldridge (2002) on panel data analysis.

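If $RSS$ denotes the residual sum of squares and $TSS$ the total sum of squares, then the $R^2$ is calculated as:³

$$R^2 = 1 - \frac{RSS}{TSS} \qquad (5)$$

$$RSS = \sum_i \sum_t \left( r_{i,t} - \hat{r}_{i,t} \right)^2 \qquad (6)$$

$$TSS = \sum_i \sum_t \left( r_{i,t} - \bar{r}_i \right)^2 \qquad (7)$$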

The focus will be on the within $R^2$ as mentioned above. For the in-sample forecasts the $\hat{r}_{i,t}$ denotes the fitted values of the dependent variable and for the out-of-sample forecasts it denotes the forecasted values. The values of the in-sample $R^2$ deviate from the standard goodness of fit values reported by software packages such as EViews, the reason being that these use the normal OLS averages for the TSS and therefore report the overall $R^2$, even when the PLS regression uses fixed effects. Both the in-sample and out-of-sample values of $R^2$ are manually calculated as the within $R^2$ to correct for this.

³ The superscript used to denote the forecasting horizon is dropped to simplify the notation.
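The difference between the within and the overall goodness of fit can be made concrete with a small sketch. It computes both measures from the same residual sum of squares, changing only the average used in the total sum of squares. The function name and all numbers are hypothetical.

```python
from collections import defaultdict


def r_squared(firms, actual, predicted, kind="within"):
    """Goodness of fit as 1 - RSS / TSS for panel observations.

    kind="within":  TSS uses the average of each firm's own observations.
    kind="overall": TSS uses the average over all observations (as in OLS).
    """
    rss = sum((a - p) ** 2 for a, p in zip(actual, predicted))

    if kind == "within":
        grouped = defaultdict(list)
        for firm, a in zip(firms, actual):
            grouped[firm].append(a)
        firm_mean = {f: sum(v) / len(v) for f, v in grouped.items()}
        benchmark = [firm_mean[f] for f in firms]
    elif kind == "overall":
        mean = sum(actual) / len(actual)
        benchmark = [mean] * len(actual)
    else:
        raise ValueError("kind must be 'within' or 'overall'")

    tss = sum((a - b) ** 2 for a, b in zip(actual, benchmark))
    return 1.0 - rss / tss


# Hypothetical returns for two firms with very different average levels.
firms = ["A", "A", "B", "B"]
actual = [1.0, 3.0, 10.0, 14.0]
good_forecasts = [1.5, 2.5, 11.0, 13.0]
poor_forecasts = [3.0, 1.0, 14.0, 10.0]

within_good = r_squared(firms, actual, good_forecasts, "within")
overall_good = r_squared(firms, actual, good_forecasts, "overall")
within_poor = r_squared(firms, actual, poor_forecasts, "within")
```

In this example the overall measure is higher than the within measure for the same forecasts, because the overall TSS also counts the differences in level between the firms. The poor forecasts yield a negative within value, meaning they do worse than each firm's own average.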

The construction of the TSS and hence the $R^2$ in this thesis deviates from those used by Goyal and Welch (2008) and many others in the sense that they use a rolling historical average. I propose another measure since the rolling historical average is conceptually a difficult measure, and using the average of an individual company in the subsample provides a more practical goodness of fit statistic. The out-of-sample goodness of fit measure is a measure of the ability of the proposed model to beat the average. Since this paper uses panel data with cross-section fixed effects, it makes more sense to use an average of individual companies instead of a historical average. Another reason is that I would like to know whether the proposed models perform better compared to the average of the period under investigation rather than the historical average.

C. Alternative Specifications

In addition to the methodology described above, three alternative methods are used to check the robustness of the analysis, in order to be sure that the specifications and methods are not providing biased or inflated results. First, all the values of $R^2$ are also calculated as the overall $R^2$. The calculation is almost identical; only the measure of the average used to calculate the total sum of squares is different. The TSS used for the overall $R^2$ is calculated by summing the squared deviations from the overall average, just as in an OLS regression:

$$TSS = \sum_i \sum_t \left( r_{i,t} - \bar{r} \right)^2 \qquad (8)$$

where $\bar{r}$ is simply the average return for all observations in the subsample.

The second alternative is the use of the adjusted $R^2$ measure. By construction, the normal $R^2$ measure will never decrease when more regressors are added. So adding the factors or the signals will presumably increase the $R^2$. The only exception to this would be when some observations are lost in an unbalanced sample; in that case the regular $R^2$ can in fact decrease. A widely accepted method to correct for this is the use of the adjusted $R^2$. This version of the statistic is corrected for the decrease in degrees of freedom caused by the inclusion of more independent variables. Following the notation by Brooks (2008) it is calculated in the following way:

$$\bar{R}^2 = 1 - \left[ \frac{T-1}{T-k} \left( 1 - R^2 \right) \right] \qquad (9)$$

where $\bar{R}^2$ denotes the adjusted version of $R^2$, $T$ is the number of observations in the sample and $k$ is the number of regressors. This number of regressors includes the different intercepts for the cross-sections, i.e. the fixed effects.

Because the different intercepts for the different cross-sections are included as separate regressors, the values of the regular and the adjusted $R^2$ do not differ that much. The typical sample size for the different sub-samples ranges between 1,500 and 3,000 observations and includes between 150 and 250 different cross-sections. Adding three factors or eleven signals to such a large number of regressors has only a marginal effect in the calculations. The difference between the regular and the adjusted $R^2$ is small and does not change the results and is therefore not reported for all individual forecasts in order to keep the already large number of reported goodness of fit statistics as small as possible. Table AVII in Appendix D does provide the adjusted $R^2$ statistics for the forecasts on the entire period (1982-2011).
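The marginal effect of the degrees-of-freedom correction can be illustrated with a short sketch. It applies the standard adjusted $R^2$ formula to a sample of the typical size mentioned above; the function name, the sample dimensions and the $R^2$ of 0.30 are hypothetical values chosen for illustration.

```python
def adjusted_r_squared(r_squared, n_obs, n_regressors):
    """Standard adjusted R-squared: 1 - (T - 1) / (T - k) * (1 - R^2)."""
    return 1.0 - (n_obs - 1) / (n_obs - n_regressors) * (1.0 - r_squared)


# Hypothetical sub-sample within the range described in the text.
N_OBS = 2000      # observations
N_FIRMS = 200     # one fixed-effect intercept per firm
R2 = 0.30         # assumed regular R-squared, identical for both models

# Base model: firm intercepts plus the three base variables.
adjusted_base = adjusted_r_squared(R2, N_OBS, N_FIRMS + 3)

# Full model: the eleven signals are added on top of the base model.
adjusted_full = adjusted_r_squared(R2, N_OBS, N_FIRMS + 3 + 11)

# Extra penalty caused by the eleven additional regressors.
extra_penalty = adjusted_base - adjusted_full
```

With roughly two hundred intercepts already counted as regressors, the eleven extra signals change the penalty by less than one percentage point in this example, in line with the observation that the regular and the adjusted statistics lead to the same conclusions.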

The third alternative specification comprises only a small adjustment to Equations 2-4, namely replacing the intercepts $\alpha_i$ by simply $\alpha$, identical intercepts for each cross-section. With these intercepts the equations are estimated using PLS without the fixed effects. The expected effects are that the in-sample goodness of fit will decrease but that the out-of-sample goodness of fit will increase. If adding more variables to a regression increases the $R^2$, then using fewer variables should decrease the in-sample fit. The fixed effects in panel data behave as separate variables for each cross-section, so logically the value of $R^2$ will decrease when there is only a single intercept. The out-of-sample goodness of fit will increase because there will be fewer coefficients to be fitted into the out-of-sample forecast period. Fewer coefficients means there are fewer sources of error and this will likely increase the out-of-sample $R^2$.

D. Interpretation of the Results

How should the goodness of fit statistics be interpreted? In a normal OLS regression the reported $R^2$ is interpreted as the fraction of the variance in the dependent variable that is explained by the variance in the independent variables. This would naturally range between 0% and 100%. This interpretation holds for the in-sample overall $R^2$ measure as constructed above, but for the other specifications the interpretation is slightly different.

The in-sample within $R^2$ can be interpreted as the share of the variance in the future return of a single cross-section for a given period that can be explained by the model. So for the in-sample forecasts the values of $R^2$ are expected to be positive and an increase in $R^2$ reflects a better model.

The out-of-sample interpretation is somewhat different; it compares the RSS of the forecasts with the TSS of the actual observations. The model that produces the best forecasts would have a lower RSS and therefore a higher out-of-sample goodness of fit. The forecasts are compared with either the average actual observations for that specific cross-section (the within) or with the average of all cross-sections (the overall) for a specific period. Emphasis should be placed on the specific period: the out-of-sample goodness of fit statistics in this thesis are informative on how well the models perform compared to the average in the forecast period. The models are not compared to the historical average, a procedure that is applied by Campbell and Thompson (2008) and Goyal and Welch (2008). The out-of-sample goodness of fit statistics can take any value: a negative value implies that the model does a worse job in forecasting than the (cross-section) average and a positive value reflects a better ability to forecast than the (cross-section) average for the period under investigation.


IV. Data

The data used to construct the signals are accounting information and return data. The data are collected from 1980 to 2011 because most of the variables were not available before 1980 for all companies. All data are end-of-year figures and the monetary data are in euros.⁴ Because some signals use one or two lagged values the starting point of the analysis is 1982. An overview of the companies is included in Appendix A. The eleven signals are based on the twelve signals used by Elleuch (2009), but they are somewhat adjusted to be more informative.

A. Variables

The dependent variable in the forecasts in Equations 2-4 is the cumulative total return of the individual companies. These companies are selected as the companies included in the EURO STOXX Index, an index of European stocks. See Appendix A for an overview of the industries and countries included in this index.

Most of the new signals are calculated as the change relative to the change in sales. In itself a change in, for example, the inventory or the accounts receivable is not informative, but relative to the change in sales it is. Some of the signals are also scaled by total assets in order to correct for firm size; therefore no measure of firm size is added as a control variable.

All data are retrieved from Thomson Reuters DataStream. The variables in the base model were ready to use immediately, but the signals had to be calculated manually from the different variables. A detailed description of the calculations of the variables as well as some justifications are included in Appendix B. The eleven signals used are:

ACCOUNTS RECEIVABLE — The first signal measures the excess change in accounts receivable over the change in sales.

ACCRUALS — The accruals signal is defined as this year’s accruals divided by last year’s total assets. The accruals consist of the change in current assets, the change in cash and cash equivalents, the change in current liabilities, and the amortization and depreciation.

4 The data are downloaded from DataStream using the automatic currency conversion to euros.

CAPITAL EXPENDITURES — Capital expenditures are used as a proxy for investments because this variable was widely available for the years and companies under investigation. The signal for capital expenditures measures the percentage change in capital expenditures in excess of the change in sales.

CASH FLOW — The cash flow signal is calculated as the current year’s cash flow divided by last year’s total assets, where the cash flow is calculated as the net income before extraordinary items less the accruals.

CHANGE IN ASSET TURNOVER — The change in asset turnover is calculated as this year’s asset turnover minus last year’s asset turnover, where the asset turnover is calculated as the total sales divided by last year’s total assets.

CHANGE IN LEVERAGE — The change in leverage is measured by the change in the debt-to-assets ratio, which is calculated as the long-term debt over the average total assets during a year.

CHANGE IN LIQUIDITY — The change in liquidity measures the increase in the current ratio of a firm, i.e. the ratio between the current assets and the current liabilities.

CHANGE IN RETURN ON ASSETS — The return on assets measures the net income before extraordinary items scaled by the total assets at the beginning of that year. The change in the return on assets is simply calculated as the absolute change in the return on assets.

INVENTORY — The size of the increase in inventory in excess of the change in sales can be informative about the profitability of a firm. The best measure of inventory would be to look only at finished products, but since this is not available for all companies the total inventory measure is used instead.

OPERATING MARGIN — The relative change in operating margin is scaled by the relative change in sales. The operating, rather than the gross, margin is used to focus on the core operating activities of a firm.

SALES PER EMPLOYEE — The last signal is analogous to the change in return on assets. The return on assets measures the change in productivity of the assets and the sales per employee measures the change in productivity of the employees. It is measured by the relative change in the sales per employee.
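The verbal definitions above can be sketched in code for three of the signals. This is only an illustration under stated assumptions: the exact formulas are those of Appendix B, the "excess change" is assumed here to be a difference of relative changes, and all function names and input figures are hypothetical.

```python
def pct_change(current, previous):
    """Relative change between two end-of-year figures."""
    return (current - previous) / previous


def accounts_receivable_signal(ar, ar_prev, sales, sales_prev):
    # Assumption: excess change = relative change in receivables
    # minus relative change in sales.
    return pct_change(ar, ar_prev) - pct_change(sales, sales_prev)


def cash_flow_signal(net_income, accruals, total_assets_prev):
    # Cash flow = net income before extraordinary items less accruals,
    # scaled by last year's total assets.
    return (net_income - accruals) / total_assets_prev


def asset_turnover_change(sales, sales_prev, assets_prev, assets_prev2):
    # Asset turnover = sales divided by last year's total assets;
    # the signal is this year's turnover minus last year's turnover.
    return sales / assets_prev - sales_prev / assets_prev2


# Hypothetical firm-year figures (in millions of euros).
ar_signal = accounts_receivable_signal(ar=120.0, ar_prev=100.0,
                                       sales=1100.0, sales_prev=1000.0)
cf_signal = cash_flow_signal(net_income=50.0, accruals=-10.0,
                             total_assets_prev=600.0)
at_signal = asset_turnover_change(sales=1100.0, sales_prev=1000.0,
                                  assets_prev=1000.0, assets_prev2=800.0)
```

The asset turnover example needs total assets from two years back, which is in line with the remark that some signals use two lagged values.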

B. Descriptive Statistics

The signals are calculated in such a way that the size of the companies is accounted for. The companies differ in size since there are data on 298 different companies between 1980 and 2011. But even after accounting for the size of the companies, the signals and factors still show enough variation in the data to be used in the forecasts. The descriptive statistics of the data used to calculate the signals are included in Appendix B. Table III below provides the descriptive statistics for the three base variables, the eleven signals and the three factors.

Table III

Descriptive Statistics of the Base Variables, Signals, and Factors

This table provides the descriptive statistics of all the variables, signals and factors used in the forecasts in Equations 2-4. The table shows the descriptive statistics after the outliers are deleted from the sample; see Table IV for more information on the number of outliers. Obs. is the number of observations and St. dev. is the standard deviation. The descriptive statistics of the variables used to calculate the eleven signals can be found in Table AIII in Appendix B.

Panel A: Descriptive Statistics of the Base Variables

Variable                          Obs.      Mean    Median   Maximum   Minimum  St. dev.
Earnings Per Share               6,666     1.344     0.734   223.939  -122.100     6.197
Dividend Yield                   5,630     3.185     2.650    43.450     0.000     2.806
Market-to-Book-Value             5,212     2.416     1.750   182.190   -76.260     4.044

Panel B: Descriptive Statistics of the Signals

Signal                            Obs.      Mean    Median   Maximum   Minimum  St. dev.
Accounts Receivable              5,422     0.012    -0.003     1.941    -1.964     0.296
Accruals                         5,130    -0.060    -0.054     2.065    -1.707     0.115
Capital Expenditures             5,658     0.016    -0.019     2.198    -1.846     0.464
Cash Flow                        5,125     0.110     0.097     1.715    -0.823     0.125
Change in Asset Turnover         6,174    -0.016    -0.003     1.816    -1.889     0.244
Change in Leverage               5,966     0.001    -0.002     0.935    -0.997     0.083
Change in Liquidity              5,159     0.000    -0.002     1.988    -1.959     0.363
Change in Return on Assets       6,243     0.000     0.001     2.109    -1.861     0.074
Inventory                        5,378    -0.009    -0.012     1.958    -1.999     0.300
Operating Margin                 6,331     0.009     0.084    19.799   -19.803     4.075
Sales per Employee               6,288     0.053     0.041     1.957    -0.988     0.215

Panel C: Descriptive Statistics of the Factors

Factor                            Obs.      Mean    Median   Maximum   Minimum  St. dev.
Assets                           4,708     0.000     0.056    17.041   -10.690     0.967
Productivity                     5,985    -0.005     0.004     6.361    -5.563     0.765
Sales                            4,695     0.004    -0.029     5.452    -5.878     0.730


The difference in the number of observations between the variables has two causes. For the base variables it is caused by missing data due to an unbalanced sample. For the signals and the factors there is an additional cause: the calculation of the signals. Most signals are calculated from multiple variables, and whenever one of these variables is unavailable for a company in any year, there is no signal for that specific company in that year. The same holds for the factors: whenever one of the signals used to calculate a factor is not available, there is no factor for that specific observation.
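How a single missing input removes an observation can be shown with a minimal sketch. The helper `signal_or_missing`, the ratio used as a stand-in signal and the three firm-years are hypothetical and only serve to illustrate the mechanism.

```python
def signal_or_missing(signal_func, *inputs):
    """Apply signal_func only when every input is available (not None)."""
    if any(value is None for value in inputs):
        return None
    return signal_func(*inputs)


def ratio(numerator, denominator):
    # Stand-in for a signal that needs two accounting variables.
    return numerator / denominator


# Hypothetical firm-years: (current assets, current liabilities).
firm_years = [(3.0, 2.0), (4.0, None), (None, 1.5)]

signals = [signal_or_missing(ratio, ca, cl) for ca, cl in firm_years]
n_observations = sum(s is not None for s in signals)
```

Only the first firm-year yields a signal here, so the signal has fewer observations than either of its input variables.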

C. Outliers

To correct for extreme values in the data all the signals are checked for outliers. Outliers occurred because most of the signals use a fraction or relative change in accounting data, which are sensitive to one-time events. Examples are large provisions taken by firms, lowering the operating margins, or the rapid increases in accounts receivable and payable of young firms. Since these outliers would affect the results of the regressions and the goodness of fit statistics, they were deleted. To cope with the outliers a minimum and a maximum value are manually determined for every signal.

Table IV

Range and Excluded Outliers per Signal

This table reports the minimum and maximum values used to eliminate outliers in the signals. All observations that fall outside this range are deleted as outliers. Both the absolute number of outliers and the percentage of observations are reported.

Signal                        Range        Outliers (#)   Outliers (%)
Accounts Receivable           [-2, +2]               93          1.686
Accruals                      [-2, +2]                3          0.058
Capital Expenditures          [-2, +2]              213          3.628
Cash Flow                     [-2, +2]                4          0.078
Change in Asset Turnover      [-2, +2]               59          0.947
Change in Leverage            [-2, +2]                0          0.000
Change in Liquidity           [-2, +2]               58          1.112
Change in Return on Assets    [-2, +2]                0          0.000
Inventory                     [-2, +2]               82          1.502
Operating Margin              [-20, +20]            168          2.585
Sales per Employee            [-2, +2]               29          0.459


For all signals, except the operating margin, the range was chosen to be [-2, +2]. Table IV above reports the ranges per variable, the number of observations lost by this method, and the percentage of the observations lost. It can be seen that only a small fraction of the data is lost, but the effect on the distribution is large. Most of the signals now follow a somewhat normal distribution, which was not the case before the outliers were deleted; see Table III for the standard deviations.
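The range filter can be sketched as follows. The sketch assumes that observations exactly on the boundary are kept; the function name `drop_outliers` and the toy values are hypothetical.

```python
def drop_outliers(values, lower=-2.0, upper=2.0):
    """Keep observations inside [lower, upper]; report how many were dropped."""
    kept = [v for v in values if lower <= v <= upper]
    n_outliers = len(values) - len(kept)
    pct_outliers = 100.0 * n_outliers / len(values)
    return kept, n_outliers, pct_outliers


# Toy signal values (hypothetical).
kept, n_out, pct_out = drop_outliers([0.1, -2.5, 1.9, 3.0, -2.0])

# Consistency check with the reported tables: the accounts receivable signal
# has 5,422 observations left and 93 outliers, so the share lost is measured
# against the 5,515 observations available before deletion.
ar_pct = round(100.0 * 93 / (5422 + 93), 3)
```

The last line reproduces the 1.686% reported for accounts receivable, which suggests that the percentages are expressed relative to the number of observations before the outliers were removed.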

D. Omissions

Apart from forecasting the returns directly, some papers use the risk-free rate to calculate the equity risk premium and test its predictability. In this thesis, however, the sample of companies is constructed with data from several countries, so there is no clear way of dealing with the risk-free rate. Data on the 3-month interbank lending rates for all the countries, except Luxembourg, were collected, but unfortunately these figures did not have a history as long as the other variables. So in order to preserve data and to have more powerful regressions the risk-free rates are not used; instead the returns are forecasted directly.

Another omission is the twelfth signal used by Elleuch (2009): the return on assets. The return on assets is excluded because it causes multicollinearity with the other signals. It is correlated with the change in the return on assets, defined as the absolute increase in the return on assets. This multicollinearity is caused by the fact that the return on assets of a firm is stable over time and will only converge to an industry average, as found by Koller, Goedhart and Wessels (2010) when looking at data from the US between 1963 and 2008. On the basis of their conclusions on stationarity I omit the return on assets as a signal.

V. Results

The results are discussed in four parts, starting with some general remarks and initial observations. These are followed by the answers to the two questions addressed in the Introduction: Do the accounting signals increase the predictability of stock returns, and does the predictability increase with the forecast horizon? In the last part the alternative specifications mentioned in Section III and their impact on the results are studied.

A. General Remarks on the Coefficients and Initial Observations

When the individual coefficients of the base variables, the factors and the signals are investigated, some observations immediately become apparent. The most important observation is that the base variables are always significant; the signals and the factors do not change this significance. To illustrate the behaviour of the individual coefficients, Table AVII in Appendix D reports all the coefficients of the different variables for the three models (Equations 2-4) for all three horizons for the entire period (1982-2011).

Although the significance of the individual coefficients is not the subject of this thesis, there are some interesting observations to point out. Including the signals or the factors does not change the significance of the three base variables: the earnings-per-share and the market-to-book value always have a negative coefficient and the dividend yield a positive coefficient. These signs are in line with the existing literature and are also robust to changes in the specifications and forecast horizons. At least for the in-sample forecasts, the goodness of fit statistics give some indication that the factor models perform better than the base models and that the full models perform better than the factor models. This result holds for both the regular and the adjusted version of R²; it is rough first evidence that the fundamental signals have additional value in predicting stock returns.

Please note that Table AVII only reports in-sample goodness of fit statistics.

Since the factors are, at least statistically, a redundant specification it is to be expected that the factors are almost always insignificant for the entire period. This is also true for the different sub-periods. Despite the insignificance, the factor models are still included in the analysis because they can be valuable in the out-of-sample forecast since they lower the number of variables. In addition to that, not all of the signals are significant for the entire period and the same holds for the individual sub-periods. However, the focus of this research is not on the individual coefficients and their significance but on the goodness of fit of the forecasts of the entire models. Therefore the goodness of fit statistics are reported and discussed below in detail for every sub-period and specification.

In almost all of the in-sample forecasts the goodness of fit increases when the signals or the factors are added. The out-of-sample results are disappointing, with mostly negative values of R², and adding the signals or the factors even causes the goodness of fit to decrease. This poor out-of-sample fit is caused by the nature of the data and the method: panel least squares (PLS) regressions with cross-section fixed effects. The cross-section fixed effects are estimated out-of-sample by different intercepts for the different companies, and the more variables there are out-of-sample the poorer the fit. Some alternative specifications are used to improve the fit to some extent and these are discussed below.
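The role of the company-specific intercepts can be illustrated with a deliberately simplified sketch: a fixed effects regression with a single predictor, estimated by demeaning per company, after which the estimated intercepts are carried over to the forecast. The thesis uses several predictors and panel least squares; the function names and the toy data are hypothetical.

```python
from collections import defaultdict


def fit_fixed_effects(observations):
    """observations: list of (firm, x, y). Returns (slope, {firm: intercept})."""
    groups = defaultdict(list)
    for firm, x, y in observations:
        groups[firm].append((x, y))
    # Firm averages of the predictor and the return.
    means = {f: (sum(x for x, _ in v) / len(v), sum(y for _, y in v) / len(v))
             for f, v in groups.items()}
    # Within estimator: regress demeaned y on demeaned x.
    sxy = sum((x - means[f][0]) * (y - means[f][1]) for f, x, y in observations)
    sxx = sum((x - means[f][0]) ** 2 for f, x, _ in observations)
    slope = sxy / sxx
    # One intercept per firm, recovered from the firm averages.
    intercepts = {f: my - slope * mx for f, (mx, my) in means.items()}
    return slope, intercepts


def forecast(slope, intercepts, firm, x):
    # The out-of-sample forecast reuses the intercept estimated in-sample.
    return intercepts[firm] + slope * x


# Toy estimation sample: firm A follows y = 1 + 2x, firm B follows y = -1 + 2x.
sample = [("A", 1.0, 3.0), ("A", 2.0, 5.0), ("A", 3.0, 7.0),
          ("B", 1.0, 1.0), ("B", 2.0, 3.0), ("B", 3.0, 5.0)]
slope, intercepts = fit_fixed_effects(sample)
prediction_a = forecast(slope, intercepts, "A", 4.0)
```

In this sketch each forecast inherits an intercept that was fitted on the estimation period only, so any shift in a company's average return between the two periods feeds directly into the forecast error.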

B. The Effects of the Signals and Factors on the Goodness of Fit

At first sight, adding the signals to the base model appears to increase the goodness of fit of the predictions, with values of the within R² for the in-sample forecasts ranging from 0.019 to 0.318 (mean: 0.095) for the base model and from 0.053 to 0.270 (mean: 0.134) for the full model. The goodness of fit statistics for the factor models lie in between, ranging from 0.027 to 0.272 (mean: 0.104). Table V below provides all the values of the within R² for the in-sample and out-of-sample forecasts based on Equations 2-4. These increases of the goodness of fit are not simply caused by the increase in the number of variables since they also hold for the adjusted measure.
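The ranges and means quoted above can be recomputed from the nine in-sample within values per model (three periods for each of the three horizons) reported in Table V. The sketch below only redoes this arithmetic; the variable names are hypothetical.

```python
from statistics import mean

# In-sample within values from Table V, ordered by horizon (1, 5, 7 years)
# and, within each horizon, by the three reported periods.
in_sample_within = {
    "base":   [0.021, 0.019, 0.046, 0.065, 0.093, 0.318, 0.026, 0.022, 0.245],
    "factor": [0.043, 0.027, 0.081, 0.091, 0.089, 0.272, 0.091, 0.089, 0.151],
    "full":   [0.082, 0.053, 0.116, 0.171, 0.140, 0.210, 0.103, 0.064, 0.270],
}


def summarize(values):
    """Return (minimum, maximum, mean rounded to three decimals)."""
    return min(values), max(values), round(mean(values), 3)


summary = {model: summarize(values) for model, values in in_sample_within.items()}
```

The resulting means are 0.095, 0.104 and 0.134 for the base, factor and full model respectively, which matches the ordering described in the text.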

The reported overall goodness of fit statistics are much higher and range from 0.070 to 0.836 (mean: 0.302) for the base model and from 0.109 to 0.854 (mean: 0.347) for the full model. The factor model ranges from 0.084 to 0.839 (mean: 0.340). See Table VI below for these results. It can be concluded that the first result holds for the in-sample forecasts for both measures of R²: the goodness of fit increases when the factors or signals are added.

The results change when the out-of-sample R² is considered: all of the reported values of R² are negative and decrease when the signals or factors are added. In some forecasts the factor model predicts better, or actually less poorly, than the full model. Even using the overall R² does not correct for this and also results in negative values of R²; see again Table VI. This confirms earlier research by, for example, Goyal and Welch (2008) and Cochrane (2008), who find that although there is some in-sample predictability in stock returns, this is not confirmed out-of-sample. The negative values of R² in the out-of-sample forecasts mean that the average return in the subsample explains more of the variance in the stock returns than any of the models do.

Table V

Goodness of Fit Measured as Within R² Including Fixed Effects

This table reports the in-sample and out-of-sample goodness of fit statistics for the three different models. Each panel reports the results on a different forecast horizon. The base model refers to Equation 2, the factor model to Equation 3 and the full model to Equation 4. The goodness of fit statistics are calculated as the within R² and the models include the cross-section fixed effects. The within R² is calculated as $1 - \sum_{i,t} (y_{i,t} - \hat{y}_{i,t})^2 / \sum_{i,t} (y_{i,t} - \bar{y}_{i})^2$, with $\hat{y}_{i,t}$ the forecast and $\bar{y}_{i}$ the average of cross-section $i$ in the period. For the in-sample forecasts the values should be between zero and one, but for the out-of-sample forecasts it can take any value, where a negative value indicates a worse fit of the model than the average of the cross-section for a specific period.

Panel A: One Year Forecast Horizon

                In-Sample Goodness of Fit                    Out-of-Sample Goodness of Fit
                (1992 – 2001)  (1992 – 2011)  (2002 – 2011)  (1992 – 2001)  (1992 – 2011)  (2002 – 2011)
Base Model              0.021          0.019          0.046         -3.539         -2.199         -0.423
Factor Model            0.043          0.027          0.081        -12.711         -7.244         -0.536
Full Model              0.082          0.053          0.116        -12.105         -6.957         -0.647

Panel B: Five Years Forecast Horizon

                In-Sample Goodness of Fit                    Out-of-Sample Goodness of Fit
                (1992 – 2001)  (1992 – 2011)  (2002 – 2011)  (1992 – 2001)  (1992 – 2011)  (2002 – 2011)
Base Model              0.065          0.093          0.318         -3.255         -2.474         -0.711
Factor Model            0.091          0.089          0.272        -12.499         -9.452         -1.072
Full Model              0.171          0.140          0.210         -9.578         -7.468         -1.587

Panel C: Seven Years Forecast Horizon

                In-Sample Goodness of Fit                    Out-of-Sample Goodness of Fit
                (1992 – 2001)  (1992 – 2011)  (2002 – 2011)  (1992 – 2001)  (1992 – 2011)  (2002 – 2011)
Base Model              0.026          0.022          0.245         -0.762         -0.827        -11.070
Factor Model            0.091          0.089          0.151         -2.274         -2.304        -19.472
Full Model              0.103          0.064          0.270         -1.857         -1.923        -43.282

It appears that the signals and factors increase the in-sample predictability of the stock returns but that this effect does not hold out-of-sample. A likely explanation of the poor out-of-sample fit is the cross-section fixed effects included in Equations 2-4; therefore an alternative specification without fixed effects is also tested. These results are discussed in the last subsection.


Table VI

Goodness of Fit Measured as Overall R² Including Fixed Effects

This table reports the in-sample and out-of-sample goodness of fit statistics for the three different models. Each panel reports the results on a different forecast horizon. The base model refers to Equation 2, the factor model to Equation 3 and the full model to Equation 4. The goodness of fit statistics are calculated as the overall R² and the models include the cross-section fixed effects. The overall R² is calculated as $1 - \sum_{i,t} (y_{i,t} - \hat{y}_{i,t})^2 / \sum_{i,t} (y_{i,t} - \bar{y})^2$, with $\hat{y}_{i,t}$ the forecast and $\bar{y}$ the average of all cross-sections in the period. For the in-sample forecasts the values should be between zero and one, but for the out-of-sample forecasts it can take any value, where a negative value indicates a worse fit of the model than the average of the period.

Panel A: One Year Forecast Horizon

                In-Sample Goodness of Fit                    Out-of-Sample Goodness of Fit
                (1992 – 2001)  (1992 – 2011)  (2002 – 2011)  (1992 – 2001)  (1992 – 2011)  (2002 – 2011)
Base Model              0.126          0.070          0.142         -3.052         -2.035         -0.279
Factor Model            0.164          0.084          0.205        -10.985         -6.757         -0.329
Full Model              0.197          0.109          0.235        -10.456         -6.486         -0.426

Panel B: Five Years Forecast Horizon

                In-Sample Goodness of Fit                    Out-of-Sample Goodness of Fit
                (1992 – 2001)  (1992 – 2011)  (2002 – 2011)  (1992 – 2001)  (1992 – 2011)  (2002 – 2011)
Base Model              0.308          0.263          0.536         -2.152         -1.826         -0.164
Factor Model            0.354          0.276          0.511         -8.602         -7.299         -0.393
Full Model              0.410          0.317          0.472         -6.524         -5.724         -0.739

Panel C: Seven Years Forecast Horizon

                In-Sample Goodness of Fit                    Out-of-Sample Goodness of Fit
                (1992 – 2001)  (1992 – 2011)  (2002 – 2011)  (1992 – 2001)  (1992 – 2011)  (2002 – 2011)
Base Model              0.239          0.202          0.836         -0.376         -0.491         -1.621
Factor Model            0.354          0.276          0.839         -1.558         -1.714         -3.104
Full Model              0.299          0.232          0.854         -1.232         -1.401         -7.876

C. The Effects of the Forecasting Horizon on the Goodness of Fit

As mentioned in the literature review, it is generally observed that a longer forecasting horizon is associated with a better fit of the forecasts. Again, at first sight the generally accepted relation between the forecasting horizon and the goodness of fit seems to hold. According to both the within and the overall R², a longer forecast horizon is associated with a better in-sample fit. The models with a five-year and a seven-year horizon are both better forecasters than the models with a one-year horizon. There appears to be some ambiguity on what the best horizon is: sometimes the five-year and sometimes the seven-year forecast horizon performs best. But it is clear that both are better than the one-year horizon.
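As a rough illustration of the horizon effect, the in-sample overall values of Table VI can be averaged per horizon over the three models and three periods. This averaging is not a statistic reported in the thesis; it is only a convenient summary, and the variable names are hypothetical.

```python
from statistics import mean

# In-sample overall values from Table VI per horizon (in years):
# base, factor and full model, each for the three reported periods.
in_sample_overall = {
    1: [0.126, 0.070, 0.142, 0.164, 0.084, 0.205, 0.197, 0.109, 0.235],
    5: [0.308, 0.263, 0.536, 0.354, 0.276, 0.511, 0.410, 0.317, 0.472],
    7: [0.239, 0.202, 0.836, 0.354, 0.276, 0.839, 0.299, 0.232, 0.854],
}

horizon_mean = {h: round(mean(v), 3) for h, v in in_sample_overall.items()}

# Both longer horizons beat the one-year horizon on average.
longer_is_better = all(horizon_mean[h] > horizon_mean[1] for h in (5, 7))
```

The averages are 0.148, 0.383 and 0.459 for the one-, five- and seven-year horizons. The individual cells show that the ranking of five versus seven years differs per model and period, which is the ambiguity noted above.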

These results are far less clear out-of-sample: both the within and the overall R² provide disappointing out-of-sample results for the goodness of fit of the forecasts. The effect of the forecast horizon is indecisive: for some forecast periods and models the goodness of fit improves with a longer horizon and for some the goodness of fit deteriorates. No conclusion can be drawn on the forecast horizon for the out-of-sample forecasts.

D. Remarks on the Alternative Specifications

So far the use of the overall R² instead of the within R² has not changed the in-sample and out-of-sample results that much. Of course the goodness of fit is better with the overall R², but the results in terms of the questions posed are unchanged. This changes when the third alternative specification is used: the forecasts without the cross-section fixed effects. See Table VII below for the goodness of fit without the fixed effects.

The in-sample goodness of fit shows only small changes when the factors or signals are added, and the direction of the changes is inconclusive. The same holds for the out-of-sample forecasts. An interesting point is that the goodness of fit is mostly best for a horizon of five years, while the seven-year horizon is mostly worst. When the forecasts are performed without the fixed effects the results are indecisive and again no conclusion can be drawn from them.

The only positive point in this alternative specification is that the out-of-sample forecasts provide a much higher goodness of fit and in three of the 27 out-of-sample forecasts even provide a positive R², meaning that the models perform better than the period average. As a final remark, none of the results change when the goodness of fit is reported as the adjusted R² as specified in Equation 9.
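The count of three positive values out of 27 can be checked directly against the out-of-sample columns of Table VII. The sketch below only redoes this count; the dictionary layout and names are hypothetical.

```python
# Out-of-sample overall values without fixed effects (Table VII),
# keyed by (model, horizon in years), one value per reported period.
out_of_sample_no_fe = {
    ("base", 1):   [-0.446, -0.268, 0.006],
    ("factor", 1): [-1.077, -0.620, 0.005],
    ("full", 1):   [-1.151, -0.657, -0.014],
    ("base", 5):   [-0.213, -0.173, 0.062],
    ("factor", 5): [-0.028, -0.205, -0.009],
    ("full", 5):   [-0.158, -0.180, -0.078],
    ("base", 7):   [-0.060, -0.127, -0.180],
    ("factor", 7): [-0.013, -0.082, -0.543],
    ("full", 7):   [-0.025, -0.105, -0.607],
}

values = [v for row in out_of_sample_no_fe.values() for v in row]
n_forecasts = len(values)
n_positive = sum(v > 0 for v in values)

# Which model/horizon combinations beat the period average at least once.
positive_cells = sorted(key for key, row in out_of_sample_no_fe.items()
                        if any(v > 0 for v in row))
```

All three positive values occur in the last reported period: the base model at the one- and five-year horizons and the factor model at the one-year horizon.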

In summary, the main results change when the alternative specifications are used, but some general observations still hold: (i) adding the signals or factors increases the in-sample predictability of the returns, (ii) the in-sample predictability is larger than the out-of-sample predictability, and (iii) the five-year and seven-year forecasting horizons provide better results than the one-year horizon. These observations do not hold for all the different forecasts, especially not for the forecasts without fixed effects, but they serve as a general rule of thumb.


Table VII

Goodness of Fit Measured as Overall R² Without Fixed Effects

This table reports the in-sample and out-of-sample goodness of fit statistics for the three different models. Each panel reports the results on a different forecast horizon. The base model refers to Equation 2, the factor model to Equation 3 and the full model to Equation 4. The goodness of fit statistics are calculated as the overall R² and the models do not include the cross-section fixed effects. The overall R² is calculated as $1 - \sum_{i,t} (y_{i,t} - \hat{y}_{i,t})^2 / \sum_{i,t} (y_{i,t} - \bar{y})^2$, with $\hat{y}_{i,t}$ the forecast and $\bar{y}$ the average of all cross-sections in the period. For the in-sample forecasts the values should be between zero and one, but for the out-of-sample forecasts it can take any value, where a negative value indicates a worse fit of the model than the average of the period.

Panel A: One Year Forecast Horizon

                In-Sample Goodness of Fit                    Out-of-Sample Goodness of Fit
                (1992 – 2001)  (1992 – 2011)  (2002 – 2011)  (1992 – 2001)  (1992 – 2011)  (2002 – 2011)
Base Model              0.004          0.014          0.028         -0.446         -0.268          0.006
Factor Model            0.019          0.017          0.030         -1.077         -0.620          0.005
Full Model              0.039          0.032          0.053         -1.151         -0.657         -0.014

Panel B: Five Years Forecast Horizon

                In-Sample Goodness of Fit                    Out-of-Sample Goodness of Fit
                (1992 – 2001)  (1992 – 2011)  (2002 – 2011)  (1992 – 2001)  (1992 – 2011)  (2002 – 2011)
Base Model              0.028          0.051          0.106         -0.213         -0.173          0.062
Factor Model            0.035          0.041          0.085         -0.028         -0.205         -0.009
Full Model              0.051          0.046          0.093         -0.158         -0.180         -0.078

Panel C: Seven Years Forecast Horizon

                In-Sample Goodness of Fit                    Out-of-Sample Goodness of Fit
                (1992 – 2001)  (1992 – 2011)  (2002 – 2011)  (1992 – 2001)  (1992 – 2011)  (2002 – 2011)
Base Model              0.011          0.010          0.034         -0.060         -0.127         -0.180
Factor Model            0.016          0.011          0.032         -0.013         -0.082         -0.543
Full Model              0.020          0.014          0.048         -0.025         -0.105         -0.607

VI. Conclusion and Limitations

The in-sample goodness of fit of the forecasted stock returns is found to increase when the accounting signals are included. The same effect is found for the three factors of firm performance. The in-sample goodness of fit also increases when the forecast horizon is changed from one year to five or seven years. These results are robust to some changes in the specifications: using the overall R² instead of the more conservative within R² does not change these in-sample results. Excluding the fixed effects lowers the goodness of fit statistics and changes the observed patterns. When the forecasts are performed out-of-sample these results also change. There does appear to be a small increase in the
