
European Mutual Fund Performance Attribution Field Keywords


Academic year: 2021


European Mutual Fund Performance Attribution

Field Keywords: Mutual fund, factor model analysis, security selection ability, market timing ability, performance attribution

By: Jan Sloots¹

University of Groningen, Faculty of Economics and Business

MSc. Finance

Supervisor: Dr. J. J. Bosma

Date: 06-06-2019

Abstract

This study analyses mutual fund performance in the European market by applying a model that performs well in measuring both security selection and market timing abilities. Thereafter, the impact of a fund’s market timing ability, expense ratio, total net assets under management and age on its adjusted performance are examined. The overall results suggest that European mutual funds underperform the market and do not exhibit market timing abilities. Furthermore, expenses and market timing abilities have a significant negative impact on a fund’s adjusted performance. A fund’s total net assets under management and age do affect its adjusted performance as well, but in a less pronounced fashion. On average, a European mutual fund tends to outperform a United States’ mutual fund.

Word count: 12403 (*excluding Appendix)


1. Introduction

Mutual funds have become increasingly popular over time, drawing attention from both investors and academics. Even decades after the first investment hit records in the early 1980s, worldwide net cash flow to funds increases from year to year, leading to an all-time high of EUR 46.64 trillion worth of assets under management in the third quarter of 2018.² Easy access, cheap and fast portfolio diversification, economies of scale and professional management have led many investors to invest in mutual funds, but does a mutual fund actually add value? Many academics have analysed whether mutual fund managers are indeed able to provide added value to investors. Often, however, the average mutual fund tends to underperform a naïve random-selection buy-and-hold strategy and to mistime market movements.

Most studies on mutual fund performance attribution focus on the United States’ market. Although the United States has always been the front runner in the mutual fund market, the European market has experienced large inflows compared to the United States during the last five to ten years. The purpose of this study is to analyse mutual fund performance in the (largely) unexplored European market. Where most studies measure stock selection and market timing abilities separately, this study combines traditional proven measures into a model that performs well in measuring both skills. Besides examining mutual fund performance, this study applies a model that allows us to assess the impact of both the measured market timing abilities and several fund characteristics on a fund’s performance, thereby further looking into which factors contribute to mutual fund performance.

For this study, 1,690 active, merged and dead mutual funds that are domiciled in Europe and invest in European equity are examined over a 10-year period from 2009 to 2018. The study shows that a combination of the Carhart (1997) four-factor model and the Henriksson and Merton (1981) market timing model outperforms traditional widely used models in explaining mutual fund performance. Furthermore, when the model is applied to the sample, I find that most mutual funds indeed underperform the market and do not exhibit any market timing ability. However, in favour of the added value of mutual funds, more evidence for positive market timing ability is found.

By applying a cross-sectional model on the sample data, I demonstrate that expenses and market timing abilities have a significant negative impact on a fund’s adjusted performance. Moreover, I find some evidence of a relation between the fund’s performance and total net assets under management and between the fund’s performance and age.

Finally, looking at the differences in performance, on average a European mutual fund tends to outperform a United States’ mutual fund. This is reflected by both a higher alpha and a higher market timing factor, measuring the fund’s security selection ability and market timing ability respectively. Differences in regulation might partly explain these differences, as United States’ mutual funds follow stricter regulations than European mutual funds.

2 Efama, Worldwide Regulated Open-ended Fund Assets and Flows, December 2018 <

This paper is organized as follows. Section 2 discusses the relevant literature on mutual fund performance. Section 3 describes the methodology of the study. Section 4 presents the data and descriptive statistics. Section 5 describes the results, and section 6 concludes the paper.

2. Literature review

2.1. Regulations for European and United States’ mutual funds

Where United States’ mutual funds follow the Investment Company Act (ICA), European mutual funds follow either the Undertakings for Collective Investment in Transferable Securities (UCITS) directive or the Alternative Investment Fund Managers (AIFM) directive. The UCITS directive is the main European framework covering collective investment schemes, which are allowed to sell shares to any investor worldwide. The AIFM directive covers alternative investment schemes, which are only allowed to sell shares to certain professional investors. AIFM funds are therefore less restricted in their borrowings and leverage and do not have to publish as much information as UCITS funds.

Differences in regulations between the United States and Europe most likely influence mutual fund performance in these regions. In general, United States’ mutual fund regulations are stricter than European mutual fund regulations. United States’ mutual funds are strictly limited in the use of senior securities, including various derivative instruments and short selling, whereas European mutual funds are allowed to engage in complex capital structures and significant short positions. However, where United States’ mutual funds can borrow up to a third of their net asset value, European mutual funds’ borrowings are limited to 10% of their net asset value. Furthermore, United States’ mutual funds follow stricter rules concerning diversification and are limited in the proportions they are allowed to invest in other funds. Finally, United States’ mutual funds follow stricter income distribution legislation, which leads to unfavourable taxation. The income distribution legislation requires United States’ mutual funds to distribute all realized capital gains to their investors every year, thereby requiring these investors to pay tax even though they did not sell any shares. In contrast, European mutual funds are not required to distribute their realized capital gains to their investors every year. By retaining realized capital gains, a European mutual fund increases its share price and prevents its investors from paying excessive tax.

2.2. Efficient market hypothesis


information. Markets, thus, respond to new information in a quick and efficient manner and any information that arises will be incorporated in the security price without a delay.

Fama (1970) describes three forms of efficient markets: the weak form, the semi-strong form and the strong form. In the weak form, security prices only reflect historical prices. In the semi-strong form, security prices reflect all information that is publicly available. In the strong form, security prices reflect any information relevant for price formation, whether public or private. As efficient markets respond to new information in a quick and efficient manner, mutual fund managers should not be able to beat the market by either stock selection or market timing.

2.3. Security selection performance models

Various models for measuring performance have been developed over time. Prominent financial literature discusses the Nobel Prize-winning Markowitz (1952) portfolio theory at length. However, as time has passed, other investment concepts have emerged that are tangential to Markowitz’s theory. Recent financial literature evaluates a mutual fund’s performance through factor model analysis, in which excess return is attributed to the portfolio manager’s security selection ability and market timing ability.

The traditional, but still widely used, approach to measure portfolio performance is the capital asset pricing model (CAPM), first elaborated by Sharpe in 1964. The CAPM gives the expected one period return 𝐸(𝑅𝑖), on any security (or portfolio) i by

𝐸(𝑅𝑖) = 𝑅𝑓+ 𝛽𝑖 (𝐸(𝑅𝑚) − 𝑅𝑓), (1)

where 𝑅𝑓 is the one-period risk free interest rate, 𝛽𝑖 is the measure of systematic risk of security i and 𝐸(𝑅𝑚) − 𝑅𝑓 is the expected one-period excess return on the market portfolio. In order to be able to measure the performance of mutual funds, Jensen (1968) transforms equation (1) to an ex post variant that allows for multiperiod forecasting abilities. Jensen (1968) measures the mutual fund’s performance by

𝑅𝑖𝑡− 𝑅𝑓𝑡 = 𝛼𝑖+ 𝛽𝑖 (𝑅𝑚𝑡− 𝑅𝑓𝑡) + 𝜖𝑖𝑡, (2)
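Equation (2) is a one-factor least-squares regression whose intercept is Jensen’s alpha. As a minimal sketch, assuming monthly excess returns are already available as plain lists, the estimate can be computed as follows. The function name `jensen_alpha` and the input series are hypothetical and purely illustrative; they are not the study’s data.

```python
from statistics import mean

def jensen_alpha(fund_excess, market_excess):
    """OLS intercept (alpha) and slope (beta) of fund excess returns
    regressed on market excess returns, as in equation (2)."""
    mx, my = mean(market_excess), mean(fund_excess)
    sxx = sum((x - mx) ** 2 for x in market_excess)
    sxy = sum((x - mx) * (y - my) for x, y in zip(market_excess, fund_excess))
    beta = sxy / sxx
    alpha = my - beta * mx
    return alpha, beta

# Hypothetical monthly excess returns (decimals). The fund is constructed
# to earn 0.8 times the market minus 10 basis points every month.
market = [0.02, -0.01, 0.03, 0.00, -0.02, 0.01]
fund = [0.8 * m - 0.001 for m in market]

alpha, beta = jensen_alpha(fund, market)
print(f"alpha={alpha:.4f}, beta={beta:.4f}")
```

A negative alpha, as in this constructed series, corresponds to underperformance relative to the market proxy after adjusting for systematic risk.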


One of the major critiques of the use of the CAPM was brought forward by Roll (1977), who questioned the validity of empirical tests of the CAPM as they are not based on the true market portfolio. According to Roll (1977), a true market portfolio has a positive proportion invested in every individual asset worldwide. Stambaugh (1982), however, studied the sensitivity of tests of the CAPM to different sets of asset returns and shows that they are not very sensitive to the composition of the market portfolio. Additionally, Henriksson (1984) argued that mutual fund managers are concerned with the universe of securities that they intend to invest in rather than attempting to forecast the true market portfolio’s return.

The beta, 𝛽𝑖, of the single market index, however, is found to contain little information about average returns (e.g., Fama and French, 1992; Chan, Jegadeesh and Lakonishok, 1996) and as there is a wide variety of investment strategies applied by mutual funds, it is preferable to use a multi-factor model that better explains mutual fund investment behaviour. Fama and French therefore extended the CAPM in 1993 to the Fama and French (hereafter FF) three-factor model. The FF three-factor model adds two empirically determined variables to the CAPM to better explain the cross-section of average returns. The FF three-factor model is given by

𝑅𝑖𝑡− 𝑅𝑓𝑡 = 𝛼𝑖+ 𝛽1𝑖 (𝑅𝑚𝑡− 𝑅𝑓𝑡) + 𝛽2𝑖 𝑆𝑀𝐵𝑡+ 𝛽3𝑖 𝐻𝑀𝐿𝑡+ 𝜖𝑖𝑡, (3)

where 𝑆𝑀𝐵𝑡 measures the size effect by the return spread of small minus big stocks and 𝐻𝑀𝐿𝑡 measures the value effect by the return spread of high minus low book-to-market equity ratio stocks. If we do not take a fund’s portfolio characteristics into account, we could incorrectly assign stock-picking skills, while the fund followed a strategy of buying small, high book-to-market equity ratio stocks, which on average outperform the equity market as a whole.

Although the FF three-factor model performs better than the CAPM in explaining the cross-section of average returns, it is not able to explain short-term persistence in mutual fund returns. Carhart (1997) therefore extends the FF three-factor model by adding the Jegadeesh and Titman (1993) momentum factor. The Carhart four-factor model is given by

𝑅𝑖𝑡− 𝑅𝑓𝑡 = 𝛼𝑖+ 𝛽1𝑖(𝑅𝑚𝑡− 𝑅𝑓𝑡) + 𝛽2𝑖 𝑆𝑀𝐵𝑡+ 𝛽3𝑖 𝐻𝑀𝐿𝑡+ 𝛽4𝑖 𝑊𝑀𝐿𝑡 + 𝜖𝑖𝑡, (4)

where 𝑊𝑀𝐿𝑡 measures the short-term persistence in returns (momentum) by the return spread of last year’s winning minus losing stocks. Carhart (1997) finds that the four-factor model almost completely explains persistence in mutual fund returns. The only significant persistence in returns that he cannot explain is the strong underperformance of the worst-return mutual funds.


average returns left unexplained. The five factor model measures return from security selection by

𝑅𝑖𝑡− 𝑅𝑓𝑡 = 𝛼𝑖+ 𝛽1𝑖 (𝑅𝑚𝑡− 𝑅𝑓𝑡) + 𝛽2𝑖 𝑆𝑀𝐵𝑡+ 𝛽3𝑖 𝐻𝑀𝐿𝑡+ 𝛽4𝑖 𝑅𝑀𝑊𝑡 + 𝛽5𝑖 𝐶𝑀𝐴𝑡+ 𝜖𝑖𝑡, (5)

where 𝑅𝑀𝑊𝑡 measures the return spread of robust profitability stocks minus weak profitability stocks and 𝐶𝑀𝐴𝑡 measures the return spread of low investment (conservative) stocks minus high investment (aggressive) stocks. Fama and French (2015) do not show results for the FF five-factor model including the Jegadeesh and Titman (1993) momentum factor, as the factor has a regression slope close to zero and so induces little change in the performance of the model.

2.4. Market timing performance models

There are two models that are widely used in financial literature to measure market timing abilities. The oldest model was introduced by Treynor and Mazuy in 1966, who reformulated the CAPM into a quadratic equation that regresses a portfolio’s excess return on the excess return of the stock market and the stock market’s excess return squared,

𝑅𝑖𝑡− 𝑅𝑓𝑡 = 𝛼𝑖+ 𝛽𝑖 (𝑅𝑚𝑡− 𝑅𝑓𝑡) + 𝛾𝑖 (𝑅𝑚𝑡− 𝑅𝑓𝑡)² + 𝜖𝑖𝑡, (6)

where 𝛽𝑖 and 𝛾𝑖 are regression coefficients on the excess return from the market index and the market’s excess return squared respectively. The fund’s portfolio manager anticipates whether the general stock market is going to rise or fall and will shift the fund’s portfolio to high-beta or low-beta stocks accordingly. A negative 𝛾𝑖 coefficient is caused by incorrectly forecasting market movements and indicates negative timing ability. Similarly, a zero 𝛾𝑖 coefficient indicates no timing ability and a positive 𝛾𝑖 coefficient indicates positive timing ability. Treynor and Mazuy (1966) estimated equation (6) for 57 mutual funds and only found one 𝛾𝑖 coefficient significantly different from zero. As none of the 𝛾𝑖 coefficients were significantly larger than zero, Treynor and Mazuy (1966) assumed that none of the funds could time the market.
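The quadratic regression in equation (6) can be sketched with an ordinary least-squares fit on a constant, the market excess return and its square. The example below is only an illustration under stated assumptions: the function name `treynor_mazuy`, the evenly spaced market series and the simulated fund (beta 0.9, timing coefficient 0.5, no noise) are all hypothetical.

```python
import numpy as np

def treynor_mazuy(fund_excess, market_excess):
    """Fit equation (6): fund excess return on a constant, the market
    excess return and the squared market excess return."""
    x = np.asarray(market_excess, dtype=float)
    y = np.asarray(fund_excess, dtype=float)
    X = np.column_stack([np.ones_like(x), x, x ** 2])
    (alpha, beta, gamma), *_ = np.linalg.lstsq(X, y, rcond=None)
    return alpha, beta, gamma

# Hypothetical market excess returns and a noise-free simulated fund whose
# exposure rises with the market (positive curvature, i.e. positive timing).
mkt = np.linspace(-0.08, 0.08, 25)
fund = 0.001 + 0.9 * mkt + 0.5 * mkt ** 2

alpha, beta, gamma = treynor_mazuy(fund, mkt)
print(f"alpha={alpha:.4f}, beta={beta:.4f}, gamma={gamma:.4f}")
```

With real data the interest lies in the sign and significance of the estimated gamma, which this sketch does not test.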

A similar methodology was proposed by Henriksson and Merton (1981), who measure market timing ability by assuming that a fund’s portfolio manager forecasts either 𝑅𝑚𝑡 > 𝑅𝑓𝑡 or 𝑅𝑚𝑡 < 𝑅𝑓𝑡. They propose the following equation:

$$R_{it} - R_{ft} = \alpha_i + \beta_i (R_{mt} - R_{ft}) + \delta_i D_t (R_{mt} - R_{ft}) + \epsilon_{it}, \tag{7}$$

where the 𝛿𝑖 coefficient is similar to the 𝛾𝑖 coefficient of the model by Treynor and Mazuy


positive and 9 negative market timing coefficients that were significant. Like Treynor and Mazuy (1966), Henriksson (1984) found little evidence that mutual fund managers time the market.

2.5. Review of empirical studies

Several empirical studies on mutual fund performance have been conducted in the past decades. The focus of a (large) majority of studies lies on performance of mutual funds domiciled in the United States. In order to sketch the expectations for this study, I review the results of empirical studies on the performance of mutual funds in the United States and Europe and the differences between those results.

Before the mid-1980s, most academic studies on United States’ mutual fund performance (e.g., Sharpe, 1966; Jensen, 1968; McDonald, 1974) conclude that mutual funds on average are not able to outperform a comparable passive market proxy. Where Sharpe (1966) and Jensen (1968) found negative alphas, evidence that mutual funds underperform their comparable passive market proxy, McDonald (1974) found on average neither under- nor outperformance. Thereafter, some contradicting evidence was found by Grinblatt and Titman (1989), who show superior performance for low asset value, aggressive-growth and growth funds. This outperformance, however, disappears when all expenses are taken into account. Similarly, Wermers (2000) finds outperformance of 1.3% before expenses but finds underperformance of 1% after expenses. In contrast to those studies, Ippolito (1989) found positive alphas for his sample of mutual funds over the period 1965-1984. However, for the majority of the funds in the sample, the positive alphas are not sufficiently large to overcome the fund’s load charges. A more recent study by Cremers and Petajisto (2009) shows for the period 1980-2003 that only mutual funds with small assets and a high active share are able to outperform a comparable market proxy both before and after expenses. Petajisto (2013) thereafter slightly changed the methodology of Cremers and Petajisto (2009) and added six more years to the sample. Petajisto finds that there are some market inefficiencies that can be exploited by active stock selection and shows that the most active stock pickers beat their indices by about 1.26% after all fees and expenses.


non-S&P 500 assets in mutual fund portfolios and show significant positive market-timing abilities for aggressive, small-company, and growth investment objective funds, negative market-timing abilities for equity-income funds and no market-timing abilities for balanced funds.

As the abovementioned studies show, results for United States’ mutual fund performance are mixed. In general, however, a United States’ mutual fund seems to slightly outperform a comparable market proxy before expenses and underperform after expenses. As for market timing abilities, some academics find significant market timing abilities. However, only one of the studies could show significant positive timing abilities on average.

Few studies have been done on the European mutual fund market as a whole. Otten and Bams (2002) investigated mutual fund performance using a sample of 506 European mutual funds investing domestically. Their results contradict the overall result from the United States’ mutual fund market and suggest that European mutual funds, small cap funds in particular, are able to outperform, indicated by their after-cost positive alphas. Vidal-Garcia (2013) examined persistence and performance of European equity mutual funds domiciled in Europe and found strong evidence of significant performance persistence. Vidal-Garcia’s findings concerning the performance of mutual funds, however, contrast with the findings of Otten and Bams, as the results show that on average mutual funds underperform at any time horizon considered. Although no study has been done on market timing abilities for the European mutual fund market as a whole, Romacho and Cortez (2006) found no significant market timing abilities for Portuguese-based mutual funds investing in local, European and international equity. Like Romacho and Cortez, Matallín-Sáez (2006) finds on average no timing abilities for Spanish mutual funds that invest domestically. Both studies have in common that on average market timing ability does not exist, but there is greater evidence of negative market timing ability. Cuthbertson, Nitzsche and O’Sullivan (2010), however, find that United Kingdom equity mutual funds do mistime the market on average. More specifically, 78% of the funds in their sample show negative market timing parameters, of which 19% are statistically significant.


3. Methodology

This section and the results section are divided into two parts. The first part is related to a factor model used to measure the aforementioned security selection and market timing abilities of mutual fund managers. The second part is dedicated to the impact of a fund’s market timing ability and several fund characteristics on the adjusted performance measured in the first part.

3.1. Mutual fund security selection and market timing abilities

The aim of the first part of the study is to design a model that performs well in measuring both security selection and market timing abilities. Jensen’s CAPM (2), Fama and French’s three-factor model (3), Carhart’s four-factor model (4) and Fama and French’s five-factor model (5) do not include a factor that measures market timing abilities and can thus only be used to measure security selection abilities. Although the alphas of the market timing models of Treynor and Mazuy (6) and Henriksson and Merton (7) can be interpreted as the fund manager’s security selection ability, these models suffer from several problems documented in financial literature. One of these problems is the passive timing effect, which is linked to the characteristics of the stocks in which the funds invest (see, e.g., Jagannathan and Korajczyk, 1986; Bollen and Busse, 2001; Matallín-Sáez, 2006). In order to work around these problems, many academics combine the power of proven models (e.g., Andreu, Matallín-Sáez and Sarto, 2018; Matallín-Sáez, 2006, among others) to come up with a model that performs well in measuring both security selection ability and market timing ability.

In line with previous literature, this study comes up with such a model by combining the model that performs best in measuring performance with the model that performs best in measuring market timing ability. Many forms of data have been used in the literature to test for mutual fund security selection and timing abilities. For that reason, in this paper, I first calculate the discussed traditional factor performance models and market timing models and combine the models that fit the data best by comparing their adjusted 𝑅² and their log-likelihoods. The applied model is given by

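$$R_{it} - R_{ft} = \alpha_i + \beta_{1i} (R_{mt} - R_{ft}) + \beta_{2i}\, \mathit{SMB}_t + \beta_{3i}\, \mathit{HML}_t + \beta_{4i}\, \mathit{WML}_t + \delta_i\, \mathit{HM}_t + \epsilon_{it}. \tag{8}$$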

In order to be able to calculate the fund’s return, 𝑅𝑖𝑡, I assume that all realized capital gains are reinvested in the fund, which is often the case for European mutual funds due to the favourable tax treatment discussed in Section 2. 𝑅𝑖𝑡 is then given by

𝑅𝑖𝑡 = 𝑁𝐴𝑉𝑖,𝑡


where 𝑁𝐴𝑉𝑖 is the fund’s net asset value per share calculated by

$$\mathit{NAV}_{it} = \frac{\text{Market value of total assets at } t \; - \; \text{Fund's total liabilities at } t}{\text{Total number of shares outstanding at time } t}. \tag{10}$$
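The net asset value calculation in (10) and the step from a NAV series to fund returns can be sketched as follows. This is a minimal illustration: the balance-sheet figures are hypothetical, and the return is assumed to be the usual simple return NAV(t) / NAV(t-1) - 1 with all gains reinvested, which is an assumption of this sketch rather than a statement of the study’s exact formula.

```python
def nav_per_share(total_assets, total_liabilities, shares_outstanding):
    """Net asset value per share as in equation (10)."""
    return (total_assets - total_liabilities) / shares_outstanding

def simple_returns(navs):
    """Period-over-period returns from a NAV series.
    ASSUMPTION: simple returns, NAV_t / NAV_{t-1} - 1."""
    return [cur / prev - 1.0 for prev, cur in zip(navs, navs[1:])]

# Hypothetical month-end figures: (total assets, total liabilities, shares).
balance_sheets = [
    (1050.0, 50.0, 100.0),
    (1100.0, 50.0, 100.0),
    (1076.0, 47.0, 98.0),
]
navs = [nav_per_share(a, l, s) for a, l, s in balance_sheets]
rets = simple_returns(navs)
print(navs, rets)
```

Note that three NAV observations yield two returns, which is why a minimum number of consecutive NAVs is needed to obtain a given number of returns.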

The U.S. one-month T-bill rate is taken as a proxy for the risk-free rate, 𝑅𝑓𝑡. The excess return on the market portfolio, 𝑅𝑚𝑡− 𝑅𝑓𝑡, is the return on a European value-weight market portfolio minus the U.S. one-month T-bill rate. The size effect factor, 𝑆𝑀𝐵𝑡, and value effect factor, 𝐻𝑀𝐿𝑡, are the region’s Fama and French (1997) factor-mimicking portfolios for size and book-to-market equity. The momentum factor, 𝑊𝑀𝐿𝑡, is the region’s Carhart factor-mimicking portfolio for lagged momentum. Finally, 𝐻𝑀𝑡 is the region’s Henriksson and Merton market timing factor. The constant (or alpha) of the model, 𝛼𝑖, then shows the fund’s security selection ability and the coefficient on 𝐻𝑀𝑡 shows the fund’s market timing ability. The factors 𝑅𝑓𝑡, 𝑅𝑚𝑡− 𝑅𝑓𝑡, 𝑆𝑀𝐵𝑡, 𝐻𝑀𝐿𝑡 and 𝑊𝑀𝐿𝑡 are obtained from the Kenneth R. French Data Library and are stated in U.S. dollar returns. The market timing factor 𝐻𝑀𝑡 is calculated as

𝐻𝑀𝑡 = 𝐷𝑡 (𝑅𝑚𝑡− 𝑅𝑓𝑡), (11)

where 𝑅𝑚𝑡− 𝑅𝑓𝑡 is the market portfolio’s excess return of the combined time-series model (8) and 𝐷𝑡 is a dummy variable that takes on a value of 0 if 𝑅𝑚𝑡 > 𝑅𝑓𝑡 and a value of -1 if 𝑅𝑚𝑡 < 𝑅𝑓𝑡.

Like Hendricks, Patel and Zeckhauser (1993) and Carhart (1997) and similar to more recent literature (e.g., Cuthbertson, Nitzsche and O’Sullivan, 2010; Petajisto, 2013; Fama and French, 2015), I form portfolios of mutual funds and estimate performance on the resulting portfolios. This creates an aggregate picture of mutual fund performance. On December 31 of each year, I form 10 equal-weighted portfolios of mutual funds based on calculated returns that year (9), 2 equal-weighted portfolios based on fund identity and 1 equal-weighted portfolio of all funds. Portfolio 1 (high) is formed of the 10% of mutual funds with the highest past 12-month return and portfolios 2 to 10 (low) are formed of the second-best 10% to the worst 10% performing mutual funds. This returns a time series of monthly returns on each decile portfolio from 2009 to 2018. The equal-weighted portfolios based on fund identity are two portfolios of mutual funds that are labelled as funds that follow the UCITS directive and funds that follow the AIFM directive.
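The market timing factor in (11) and the estimation of the combined time-series model (8) on a single portfolio can be sketched as below. This is a hedged illustration only: the helper names `hm_factor` and `fit_combined_model` are hypothetical, the factor series are simulated rather than taken from the Kenneth R. French Data Library, and the coefficients used to simulate the portfolio are arbitrary. The case 𝑅𝑚𝑡 = 𝑅𝑓𝑡 is not covered by the definition of 𝐷𝑡; the sketch sets the dummy to 0 there, which leaves the factor unchanged because the excess return is zero.

```python
import numpy as np

def hm_factor(market_excess):
    """HM_t = D_t * (R_mt - R_ft), with D_t = 0 in up markets and
    D_t = -1 in down markets, so HM_t = max(0, -(R_mt - R_ft))."""
    x = np.asarray(market_excess, dtype=float)
    d = np.where(x < 0.0, -1.0, 0.0)
    return d * x

def fit_combined_model(port_excess, mkt, smb, hml, wml):
    """OLS estimate of model (8): four Carhart factors plus the HM factor."""
    mkt = np.asarray(mkt, dtype=float)
    X = np.column_stack([np.ones(len(mkt)), mkt, smb, hml, wml, hm_factor(mkt)])
    coef, *_ = np.linalg.lstsq(X, np.asarray(port_excess, dtype=float), rcond=None)
    return dict(zip(["alpha", "mkt", "smb", "hml", "wml", "hm"], coef))

# Simulated monthly factor returns (120 months) and a noise-free portfolio
# with arbitrary, purely illustrative exposures.
rng = np.random.default_rng(1)
mkt, smb, hml, wml = (rng.normal(0.0, 0.03, 120) for _ in range(4))
port = -0.001 + 0.95 * mkt + 0.2 * smb - 0.1 * hml + 0.05 * wml + 0.3 * hm_factor(mkt)

est = fit_combined_model(port, mkt, smb, hml, wml)
print({name: round(float(value), 4) for name, value in est.items()})
```

In this setup the intercept plays the role of the security selection measure and the coefficient on the HM column plays the role of the market timing measure; standard errors and significance tests are left out of the sketch.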

3.1.1. Robustness tests of the combined time-series model (8)

To test the validity of the factor models and market timing models, and in particular the validity of the combined time-series model (8), I conduct a cross-sectional regression test and two time-series regression tests.


model using time-series return observations available up to month 𝑡 − 1. In the second stage, for each month over the test period (an extended version of) the following cross-sectional regression is estimated,

$$R_{it} = \gamma_{0t} + \gamma_{1t}\, \hat{\beta}_{it-1} + \gamma_{2t}\, \hat{\beta}_{it-1}^{2} + \gamma_{3t}\, \hat{s}_{it-1}(\hat{\varepsilon}_i) + \eta_{it}, \tag{12}$$

where 𝑅𝑖𝑡 is the return on portfolio 𝑖 in month 𝑡, 𝛽̂𝑖𝑡−1 is the estimated factor beta for portfolio 𝑖 in month 𝑡 − 1, 𝑠̂𝑖𝑡−1(𝜀̂𝑖) is the estimated residual standard deviation for portfolio 𝑖 in month 𝑡 − 1 and 𝜂𝑖𝑡 is the random error term. The null hypothesis that 𝛾𝑗 = 0 for 𝑗 = 0, 1, 2, 3 is tested by the following t-statistic,

$$t(\bar{\hat{\gamma}}_j) = \frac{\bar{\hat{\gamma}}_j}{s(\bar{\hat{\gamma}}_j)/\sqrt{T}}, \tag{13}$$

where 𝑠(𝛾̂̅𝑗) is the standard deviation of the estimated gamma coefficient 𝛾̂̅𝑗, and T is the number of estimated gamma coefficients. Intuitively, the betas should be the only relevant measures. So 𝛾0 should not be significantly different from zero and 𝛾1 should be significantly different from zero.
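The t-statistic in (13) is the time-series mean of the monthly cross-sectional coefficient estimates divided by its standard error. A minimal sketch, assuming the monthly estimates are already available and that the standard deviation is the sample standard deviation (divisor T − 1), is given below; the function name and the input values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def fama_macbeth_t(monthly_gammas):
    """t-statistic of equation (13): mean of the monthly estimates over
    (standard deviation / sqrt(T)).
    ASSUMPTION: sample standard deviation with divisor T - 1."""
    T = len(monthly_gammas)
    return mean(monthly_gammas) / (stdev(monthly_gammas) / sqrt(T))

# Hypothetical monthly estimates of one cross-sectional coefficient.
gamma_1 = [0.012, -0.004, 0.009, 0.015, 0.001, 0.007, -0.002, 0.010]
print(round(fama_macbeth_t(gamma_1), 3))
```

The statistic would then be compared with the critical values of a t-distribution with T − 1 degrees of freedom.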

Gibbons, Ross and Shanken (1989) propose a time-series regression test by assuming that the residual returns of the test assets in equation (2) are multivariate normally distributed. Like the test procedure of Fama and MacBeth, the test was originally designed to test the CAPM but can be extended to test models with more than one factor. Where the extension of the Fama and MacBeth test is straightforward, the extension of this test is more technical in nature and is therefore explained further. Gibbons, Ross and Shanken simultaneously test the hypothesis that all alphas are zero by the following F-statistic with degrees of freedom 𝑁 and (𝑇 − 𝑁 − 1),

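$$F_{GRS} = \left[\frac{T(T-N-1)}{N(T-2)}\right] \left(\frac{\hat{\alpha}_p' \hat{\Sigma}^{-1} \hat{\alpha}_p}{1 + \hat{\theta}_p^{2}}\right), \tag{14}$$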

which can be extended to a multivariate case by the following F-statistic with degrees of freedom 𝑁 and (𝑇 − 𝑁 − 𝐿),

$$F_{GRS} = \left(\frac{T}{N}\right) \left[\frac{T-N-L}{T-L-1}\right] \left(\frac{\hat{\alpha}_p' \hat{\Sigma}^{-1} \hat{\alpha}_p}{1 + \bar{r}_p' \hat{\Omega}^{-1} \bar{r}_p}\right), \tag{15}$$

where T is the number of time-series return observations, N is the number of test assets, L is the number of factors, 𝛼̂𝑝′ = (𝛼̂1, 𝛼̂2, … , 𝛼̂𝑁), ∑̂ is the covariance matrix of the residual returns, 𝑟̅𝑝 is a vector of mean factor returns and Ω̂ is the variance-covariance matrix of the factor returns. Thus, a large value of the test’s 𝐹𝐺𝑅𝑆 statistic rejects the validity of the factor model. Additionally, a model with a lower 𝐹𝐺𝑅𝑆 statistic is considered an improvement when compared to a model with a higher 𝐹𝐺𝑅𝑆 statistic.
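The multi-factor statistic in (15) can be sketched in a few lines of linear algebra. The sketch below rests on assumptions that the text does not spell out: the residual covariance matrix is estimated with divisor T − L − 1 and the factor covariance matrix with divisor T, which is the convention usually paired with this scaling. The function name `grs_statistic` and all simulated inputs are hypothetical.

```python
import numpy as np

def grs_statistic(excess_returns, factors):
    """GRS F-statistic of equation (15) and its degrees of freedom.
    excess_returns: T x N array of test-asset excess returns.
    factors:        T x L array of factor returns.
    ASSUMPTIONS: residual covariance uses divisor T - L - 1,
    factor covariance uses divisor T."""
    R = np.asarray(excess_returns, dtype=float)
    F = np.asarray(factors, dtype=float)
    T, N = R.shape
    L = F.shape[1]
    X = np.column_stack([np.ones(T), F])
    B, *_ = np.linalg.lstsq(X, R, rcond=None)   # (L + 1) x N coefficients
    alpha = B[0]                                # intercept of each test asset
    resid = R - X @ B
    sigma = resid.T @ resid / (T - L - 1)       # residual covariance
    mu = F.mean(axis=0)                         # mean factor returns
    centred = F - mu
    omega = centred.T @ centred / T             # factor covariance
    quad_alpha = alpha @ np.linalg.solve(sigma, alpha)
    quad_mu = mu @ np.linalg.solve(omega, mu)
    stat = (T / N) * ((T - N - L) / (T - L - 1)) * quad_alpha / (1.0 + quad_mu)
    return stat, (N, T - N - L)

# Simulated example: 3 test portfolios, 2 factors, 120 months.
rng = np.random.default_rng(3)
T, N, L = 120, 3, 2
factors = rng.normal(0.005, 0.03, (T, L))
betas = rng.normal(1.0, 0.2, (L, N))
returns_zero_alpha = factors @ betas + rng.normal(0.0, 0.02, (T, N))

stat0, dof = grs_statistic(returns_zero_alpha, factors)
stat1, _ = grs_statistic(returns_zero_alpha + 0.05, factors)  # add a large alpha
print(round(stat0, 3), round(stat1, 3), dof)
```

Adding a constant to every test-asset return leaves the residuals unchanged but shifts all intercepts, so the second statistic is much larger than the first, which is the behaviour the test relies on.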


Finally, I check whether the Greek financial crisis affects my data by adding a dummy variable for August 2010. The choice for August 2010 is based on the outcomes of a structural break test proposed by Chow (1960), which will be shown in Section 3. The applied model is given by

$$R_{it} - R_{ft} = \alpha_i + \beta_{1i} (R_{mt} - R_{ft}) + \beta_{2i}\, \mathit{SMB}_t + \beta_{3i}\, \mathit{HML}_t + \beta_{4i}\, \mathit{WML}_t + \delta_i\, \mathit{HM}_t + \gamma_i D_t + \epsilon_{it}, \tag{16}$$

where the calculation and interpretation of all components are similar to that of the combined time-series model (8) and 𝐷𝑡 is a dummy variable that takes on value 0 before August 2010 and value 1 from August 2010 onwards. The significance of the dummy variable and the impact of the variable on the regression results will tell us whether parameter stability is an issue.

3.2. The influence of market timing ability and fund characteristics on adjusted performance

In the first part of this study, I analyse how well mutual funds time the market and if they are able to outperform their comparable market proxy. The aim of this follow-up part of the study is to analyse whether the measured market timing ability impacts a mutual fund’s adjusted performance. Besides studying the impact of market timing abilities on adjusted performance, I assess the impact of several fund characteristics. Fund managers often claim that expenses do not reduce performance, in the sense that good management costs money. One would therefore expect higher returns for higher management expenses. Carhart (1997), Wermers (2000) and Cremers and Petajisto (2006), however, found a strong negative relation between the two. In contrast to expense ratios, total net assets and age might positively influence the performance of a fund due to economies of scale and the value of experience respectively. The following cross-sectional model is applied to assess the influence of market timing ability and fund characteristics on adjusted performance,

𝛼𝑖 = 𝑐0 + 𝑐1 𝐻𝑀𝑖 + 𝑐2 𝑇𝐸𝑅𝑖 + 𝑐3 𝐿𝑁 𝐴𝑠𝑠𝑒𝑡𝑠𝑖 + 𝑐4 𝐿𝑁 𝐴𝑔𝑒𝑖+ 𝜖𝑖, (17)

where 𝛼𝑖, the adjusted performance, is the alpha of the combined time-series model (8), 𝐻𝑀𝑖 (11) is the market timing factor of the combined time-series model (8), 𝑇𝐸𝑅𝑖 is the fund’s total


On December 31 of the last five years of the sample, I regress the combined time-series model (8) over the past 12 months for all mutual funds to obtain each fund’s alpha, 𝛼𝑖, and market timing factor 𝐻𝑀𝑖. The fund’s characteristics 𝑇𝐸𝑅𝑖, 𝐿𝑁 𝐴𝑠𝑠𝑒𝑡𝑠𝑖 and 𝐿𝑁 𝐴𝑔𝑒𝑖 are also measured on December 31, where 𝑇𝐸𝑅𝑖 is calculated over the past 12 months. This returns a cross-section of alphas, market timing factors and fund characteristics from 2014 to 2018.
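Once a cross-section of alphas, market timing coefficients and fund characteristics is available for a given year, the cross-sectional model (17) is again a least-squares fit. The following sketch only illustrates the mechanics: the function name `fit_cross_section`, the simulated fund characteristics and the coefficients used to generate the alphas are arbitrary assumptions and say nothing about the study’s results.

```python
import numpy as np

def fit_cross_section(alpha, hm, ter, assets, age):
    """OLS estimate of model (17): alpha on the market timing coefficient,
    the total expense ratio, ln(total net assets) and ln(age)."""
    X = np.column_stack([np.ones(len(alpha)), hm, ter, np.log(assets), np.log(age)])
    coef, *_ = np.linalg.lstsq(X, np.asarray(alpha, dtype=float), rcond=None)
    return dict(zip(["c0", "c1_hm", "c2_ter", "c3_ln_assets", "c4_ln_age"], coef))

# Simulated cross-section of 200 hypothetical funds for one year-end.
rng = np.random.default_rng(2)
n = 200
hm = rng.normal(0.0, 0.2, n)            # estimated timing coefficients
ter = rng.uniform(0.005, 0.025, n)      # total expense ratios
assets = rng.lognormal(4.0, 1.5, n)     # total net assets, e.g. in EUR millions
age = rng.uniform(1.0, 30.0, n)         # fund age in years
alpha = 0.002 - 0.004 * hm - 0.5 * ter + 0.0003 * np.log(assets) + 0.0002 * np.log(age)

fit = fit_cross_section(alpha, hm, ter, assets, age)
print({name: round(float(value), 4) for name, value in fit.items()})
```

Repeating the fit for each year-end from 2014 to 2018 would give one set of coefficients per year, which is the setting in which pooled and per-year results can diverge.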

3.2.1. Robustness tests of the cross-sectional model (17)

As a first robustness test of the cross-sectional model (17), I compare its results to that of a similar cross-sectional model with the market timing factor 𝑇𝑀𝑖 instead of 𝐻𝑀𝑖. First, I adjust the combined time-series model (8) to the following model to retrieve a comparable market timing factor

𝑅𝑖𝑡− 𝑅𝑓𝑡 = 𝛼𝑖+ 𝛽1𝑖 (𝑅𝑚𝑡− 𝑅𝑓𝑡) + 𝛽2𝑖 𝑆𝑀𝐵𝑡+ 𝛽3𝑖 𝐻𝑀𝐿𝑡+ 𝛽4𝑖 𝑊𝑀𝐿𝑡

+ 𝛾𝑖 𝑇𝑀𝑡+ 𝜖𝑖𝑡, (18)

where the calculation and interpretation of all components are similar to that of the combined time-series model (8), but 𝐻𝑀𝑡 is replaced by 𝑇𝑀𝑡. The market timing factor 𝑇𝑀𝑡 is calculated as

𝑇𝑀𝑡 = (𝑅𝑚𝑡− 𝑅𝑓𝑡)², (19)

where 𝑅𝑚𝑡− 𝑅𝑓𝑡, is the return on a European value-weight market portfolio minus the U.S. one-month T-bill rate. Thereafter, the following model is applied to assess the influence of market timing ability and fund characteristics on adjusted performance,

𝛼𝑖 = 𝑐0 + 𝑐1 𝑇𝑀𝑖 + 𝑐2 𝑇𝐸𝑅𝑖+ 𝑐3 𝐿𝑁 𝐴𝑠𝑠𝑒𝑡𝑠𝑖 + 𝑐4 𝐿𝑁 𝐴𝑔𝑒𝑖+ 𝜖𝑖, (20)

where the calculation and interpretation of all components are similar to that of the cross-sectional model (17), but 𝐻𝑀𝑖 is replaced by 𝑇𝑀𝑖, which is the market timing factor of the combined time-series model (18). If the results of cross-sectional model (20) are both similar in impact and in sign to that of cross-sectional model (17), the robustness test provides evidence in favour of the validity of the regression results.

As a second robustness test of the cross-sectional model (17), I compare its results to those of a similar cross-sectional model without the market timing factor, a model that thus only tests the influence of fund characteristics on adjusted fund performance. The following model is applied


where the calculation and interpretation of all components are similar to that of cross-sectional model (17). Note that the market timing variable 𝐻𝑀𝑖 is removed. If the results of cross-sectional model (21) are similar both in impact and in sign to those of cross-sectional model (17), the robustness test provides evidence in favour of the validity of the regression results.

As a final robustness test of the cross-sectional model (17), a panel regression is conducted over the time period. A common phenomenon in scientific research, in particular when several cross-sectional regressions are conducted, is that there appears to be a trend when different samples are assessed but this trend disappears or reverses when these samples are combined by, for example, a panel regression. This phenomenon was first described by Simpson (1951) and was later named Simpson’s Paradox (Blyth, 1971). To come up with the best panel regression model, multiple Hausman (1978) specification tests and dummy variable significance tests are conducted. Accordingly, the following random effects model is applied

𝛼𝑖𝑡 = 𝑐0 + 𝑐1 𝐻𝑀𝑖𝑡 + 𝑐2 𝑇𝐸𝑅𝑖𝑡 + 𝑐3 𝐿𝑁 𝐴𝑠𝑠𝑒𝑡𝑠𝑖𝑡 + 𝑐4 𝐿𝑁 𝐴𝑔𝑒𝑖𝑡 + 𝛾1 𝐷2014𝑡 + 𝛾2 𝐷2015𝑡 + 𝛾3 𝐷2016𝑡 + 𝛾4 𝐷2017𝑡 + (𝜇𝑖 + 𝜖𝑖𝑡), (22)

where the calculation and interpretation of the variables 𝛼𝑖𝑡, 𝐻𝑀𝑖𝑡, 𝑇𝐸𝑅𝑖𝑡, 𝐿𝑁 𝐴𝑠𝑠𝑒𝑡𝑠𝑖𝑡 and 𝐿𝑁 𝐴𝑔𝑒𝑖𝑡 are similar to those of cross-sectional model (17), the variables 𝐷2014𝑡, 𝐷2015𝑡, 𝐷2016𝑡 and 𝐷2017𝑡 are dummy variables for the years 2014 to 2017 respectively to deal with time-fixed effects, 𝜇𝑖 is the unobserved random effect that varies across funds but not over time, and 𝜖𝑖𝑡 is the idiosyncratic error term of the regression. If the variables 𝐻𝑀𝑖𝑡, 𝑇𝐸𝑅𝑖𝑡, 𝐿𝑁 𝐴𝑠𝑠𝑒𝑡𝑠𝑖𝑡 and 𝐿𝑁 𝐴𝑔𝑒𝑖𝑡 of the random effects model (22) are similar in sign and significance to those of the cross-sectional model (17), the presence of a Simpson’s Paradox is unlikely and the robustness test provides evidence in favour of the validity of the regression results.
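The reversal that motivates this last robustness test can be illustrated with a small numerical sketch. The data below are hypothetical and chosen only to make the effect visible; the helper `ols_slope` is an illustrative name, and pooled simple OLS is used here in place of the random effects estimator of model (22).

```python
def ols_slope(xs, ys):
    """Slope of a simple OLS regression of ys on xs (with intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx


# Hypothetical sub-samples (made up for illustration, not thesis data).
sample_a = ([1.0, 2.0, 3.0], [10.0, 11.0, 12.0])
sample_b = ([7.0, 8.0, 9.0], [2.0, 3.0, 4.0])

slope_a = ols_slope(*sample_a)  # positive within sample A
slope_b = ols_slope(*sample_b)  # positive within sample B
# Pooling both samples reverses the sign of the estimated relation.
slope_pooled = ols_slope(sample_a[0] + sample_b[0], sample_a[1] + sample_b[1])
```

Within each sub-sample the slope is positive, whereas the pooled slope is negative, which is the kind of reversal the comparison between model (17) and model (22) is meant to rule out.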

4. Data and Descriptive Statistics

This study examines the performance of European mutual funds over the period 2009 to 2018. Where some studies take longer windows (e.g., Daniel et al., 1997; Wermers, 2000) and others take shorter windows (e.g., Otten and Bams, 2002; Romacho and Cortez, 2006), I choose a medium window of 10 years, comparable to a business cycle. Most studies on mutual fund performance are from the early 2000s or before. The intention of this study is to assess recent performance.


within the same period that includes the (weaker) failed mutual funds. Carhart et al. (2002) argue that survivorship bias significantly affects the outcomes of studies on mutual funds’ performance. They find that the annual bias ranges from 0.07% for one-year samples to 1% for samples of more than 15 years. To prevent survivorship bias, the data includes all mutual funds that are in existence, came to existence and went out of existence during the sample period.

The Thomson Reuters Eikon Fund Screener is used to include active, liquidated and merged funds in the asset universe “mutual fund”. In line with other literature (e.g., Bello and Janjigian, 2003; Fama and French, 2010), index funds and exchange traded funds are excluded because they track an underlying index, which is considered passive management. Furthermore, only funds that are domiciled in Europe and invest in European equity are included, thereby matching the European value-weight factor portfolios extracted from the Kenneth R. French Data Library. These criteria yield a list of 2200 mutual funds. The data is refined by only including funds that reported at least thirteen consecutive net asset values within the time period, so that at least 12 consecutive returns can be calculated for each mutual fund, which is in line with Fama and French (1996) and Carhart (1997). Table 1 provides summary statistics of the remaining 1690 funds used for this study. Clearly, most funds are qualified as UCITS and the majority of European mutual funds are domiciled in Northern Europe. Also, the expense ratios and total net assets differ considerably among mutual funds, and the average total net asset value is small compared to that of United States’ mutual funds (see Otten and Schweitzer, 2002). Furthermore, notice that dead funds make up a large part of this survivorship-bias-free sample.
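The refinement step can be sketched as follows. This is a minimal illustration, assuming that a missing observation is coded as `None` and that returns are simple NAV changes (distributions are ignored); the function names are hypothetical and not taken from the data source.

```python
def longest_run(navs):
    """Length of the longest run of consecutive non-missing NAVs (None = missing)."""
    best = current = 0
    for nav in navs:
        current = current + 1 if nav is not None else 0
        best = max(best, current)
    return best


def simple_returns(navs):
    """Return between adjacent NAVs; None where either observation is missing."""
    out = []
    for prev, cur in zip(navs, navs[1:]):
        if prev is not None and cur is not None:
            out.append(cur / prev - 1.0)
        else:
            out.append(None)
    return out


def passes_filter(navs, min_navs=13):
    """Thirteen consecutive NAVs are needed to obtain 12 consecutive returns."""
    return longest_run(navs) >= min_navs
```

A fund with thirteen uninterrupted observations passes, while a fund whose series is split by a single gap into two runs of six does not.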

Table 1

Summary Statistics Mutual Fund Database

The table reports time-series averages and medians from 2009 to 2018 and cross-sectional averages at the end of the sample, 31 December 2018. NAV is net asset value. Exp ratio is all management fees and operating costs divided by average TNA. TNA is total net assets. Live funds are in operation at the end of the sample and dead funds discontinued operations prior to this date.

Period 2009 - 2018 End Of 2018

| Group | Total Number | Avg Number | Med NAV ($) | Avg Exp Ratio (%/year) | Avg TNA ($ millions) | Avg Age (years) |
|---|---|---|---|---|---|---|
| Belgium | 95 | 54.2 | 334.3 | 1.22 | 54.0 | 10.3 |
| Finland | 86 | 56.4 | 62.5 | 1.19 | 210.2 | 10.0 |
| Switzerland | 80 | 53.1 | 612.7 | 0.46 | 225.7 | 8.6 |
| Spain | 75 | 38.1 | 11.1 | 1.69 | 82.9 | 7.8 |
| Denmark | 32 | 27.8 | 17.1 | 1.36 | 63.8 | 15.5 |
| Italy | 45 | 24.6 | 8.4 | 1.81 | 183.4 | 14.6 |
| Sweden | 44 | 23.7 | 18.9 | 1.01 | 392.3 | 10.1 |
| Netherlands | 40 | 25.5 | 30.4 | 0.50 | 259.2 | 9.8 |
| Norway | 17 | 10.4 | 116.2 | 0.97 | 205.4 | 8.3 |
| Portugal | 10 | 9.4 | 10.9 | 1.99 | 29.5 | 21.0 |
| Slovenia | 4 | 4.0 | 9.1 | NA | 54.4 | 18.9 |
| Hungary | 5 | 2.5 | 1.5 | NA | 12.0 | 6.0 |
| Poland | 5 | 2.8 | 28.3 | NA | 15.2 | 7.0 |
| Greece | 3 | 2.1 | 9.4 | 3.06 | 16.6 | 10.7 |
| Latvia | 1 | 1.0 | 49.1 | 2.26 | 3.5 | 14.4 |
| Turkey | 1 | 0.7 | 1.0 | NA | 14.0 | 7.7 |

By current status

Live funds 1100 1033.5 125.9

Dead funds 590 656.5 124.1

As mentioned in Section 3, at the end of each year 10 portfolios are formed based on the funds’ previous 12-month returns, 2 portfolios are formed based on their identity and 1 portfolio includes all mutual funds. Table 2 provides summary statistics of these 13 portfolios and the number of funds that are used to calculate the portfolios each year. The calculated monthly returns differ considerably among the funds each year, up to a difference of 9% in 2011. Therefore, I expect to find a large difference in alphas between the funds. Furthermore, as mentioned in the introduction, the number of funds still increases from year to year. Finally, note that the last five years of the table represent the year-by-year data used for the regression of cross-sectional model (17).
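The formation of the ten return-sorted portfolios can be sketched as below. This is an illustrative assumption about the mechanics only: `form_deciles` and the fund identifiers are hypothetical names, ties are broken by input order, and the sketch assigns membership without computing portfolio returns.

```python
def form_deciles(prior_returns, n_groups=10):
    """Map portfolio number (1 = highest prior return) to a list of fund ids.

    prior_returns: dict of fund id -> previous 12-month return.
    """
    ranked = sorted(prior_returns, key=prior_returns.get, reverse=True)
    n = len(ranked)
    portfolios = {k: [] for k in range(1, n_groups + 1)}
    for rank, fund in enumerate(ranked):
        # rank runs from 0 to n - 1, so the group index runs from 1 to n_groups
        portfolios[rank * n_groups // n + 1].append(fund)
    return portfolios


# Hypothetical example: 20 funds with prior returns of 0% to 19%.
example = {f"F{i}": i / 100 for i in range(20)}
deciles = form_deciles(example)
```

With 20 funds each portfolio holds two funds; portfolio 1 (high) contains the two best performers and portfolio 10 (low) the two worst.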

Table 2

Summary Statistics Mutual Fund Portfolios

The table reports the number of funds used each year for the formation of portfolios 1 (high) to 10 (low), UCITS, AIFM and ALL. The total number of funds used is equal to the number of funds within the sample that are in operation and have at least 12 consecutive returns that year.

| Year | Total Number | Average Monthly Return | Median Monthly Return | Lowest Monthly Return | Highest Monthly Return |
|---|---|---|---|---|---|
| 2009 | 882 | 2.61% | 2.61% | -0.68% | 6.00% |
| 2010 | 917 | 0.46% | 0.47% | -8.06% | 2.47% |
| 2011 | 946 | -1.25% | -1.17% | -8.70% | 0.28% |
| 2012 | 977 | 1.54% | 1.55% | -1.55% | 5.86% |
| 2014 | 979 | -0.73% | -0.72% | -2.52% | 2.70% |
| 2015 | 986 | -0.04% | -0.04% | -2.00% | 1.57% |
| 2016 | 990 | -0.21% | -0.20% | -4.87% | 1.50% |
| 2017 | 1024 | 1.80% | 1.82% | -6.49% | 4.64% |
| 2018 | 1100 | -1.72% | -1.70% | -6.56% | 1.43% |

The summary statistics of the combined time-series model to assess security selection and market timing abilities (8) are provided in Table 3. As expected, the market timing factor HM and the excess return on the market Mkt-Rf are highly (negatively) correlated with each other. Furthermore, the monthly excess return and t-statistic of the HML factor are low compared to the other factors of the model. In order to ensure the validity of the results of time-series regression model (8), several diagnostic tests have been conducted on the model and sample data. As shown by Appendix A, tests for autocorrelation, structural breaks, heteroscedasticity, non-stationarity and autoregressive conditional heteroscedasticity are conducted. The Chow (1960) test gives some evidence of a structural break in August 2010; as mentioned in Section 3, a robustness test is therefore conducted to analyse its impact on the regression results. Furthermore, evidence of heteroscedasticity and ARCH effects is found, which is dealt with by reporting regression outputs with Newey-West (1987) heteroscedasticity and autocorrelation consistent t-statistics.
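The negative correlation between HM and Mkt-Rf follows from how a Henriksson and Merton type timing regressor is usually built. The sketch below assumes the standard form max(0, −(𝑅𝑚𝑡 − 𝑅𝑓𝑡)); this is an assumption about the construction, which matches the positive mean and negative correlation reported in Table 3 but is not restated in this section. The name `hm_factor` is hypothetical.

```python
def hm_factor(market_excess):
    """Timing regressor per period: max(0, -(Rm - Rf)).

    Assumed standard Henriksson-Merton form; it is zero in months where the
    market beats the risk-free rate and positive in months where it does not.
    """
    return [max(0.0, -x) for x in market_excess]


# Hypothetical monthly market excess returns in percent.
example_hm = hm_factor([2.0, -3.0, 0.0, -0.5])
```

Because the regressor is non-zero only in down markets, it moves against the market excess return by construction.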

Table 3

Summary Statistics Time-series Model (8), January 2009 to December 2018

Mkt-Rf is the return on the region’s value-weight market portfolio minus the U.S. one-month T-bill rate. SMB and HML are the region’s Fama and French’s factor-mimicking portfolios for size and book-to-market equity. WML is the region’s Carhart’s factor-mimicking portfolio for lagged momentum. HM is the region’s Henriksson and Merton’s market timing factor.

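Cross-correlations are given in the columns Mkt-Rf to HM.

| Factor Portfolio | Monthly Excess Return | Std Dev | t-stat for Mean = 0 | Mkt-Rf | SMB | HML | WML | HM |
|---|---|---|---|---|---|---|---|---|
| Mkt-Rf | 0.79 | 5.21 | 1.66 | 1.00 | | | | |
| SMB | 0.48 | 1.73 | 3.07 | -0.25 | 1.00 | | | |
| HML | 0.08 | 2.50 | 0.34 | 0.54 | -0.12 | 1.00 | | |
| WML | 0.67 | 3.95 | 1.87 | -0.46 | 0.02 | -0.51 | 1.00 | |
| HM | 1.64 | 2.90 | 6.20 | -0.82 | 0.18 | -0.43 | 0.27 | 1.00 |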
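The t-statistics in Table 3 can be checked with a short calculation. The sketch assumes the usual one-sample t-statistic, mean divided by the standard deviation over the square root of the number of observations, with 120 monthly observations for January 2009 to December 2018; the formula is not stated in the text, and small deviations arise because the inputs are rounded.

```python
import math


def t_stat_mean_zero(mean, std_dev, n_obs):
    """One-sample t-statistic for the null hypothesis that the mean is zero."""
    return mean / (std_dev / math.sqrt(n_obs))


N_MONTHS = 120  # January 2009 to December 2018

t_mkt = t_stat_mean_zero(0.79, 5.21, N_MONTHS)  # Mkt-Rf row of Table 3
t_hm = t_stat_mean_zero(1.64, 2.90, N_MONTHS)   # HM row of Table 3
```

For Mkt-Rf this reproduces the reported 1.66, and for HM it gives a value within rounding distance of the reported 6.20.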


in 2017 the proxy for the market portfolio outperformed the proxy for the risk-free rate every month. Moreover, the raw standard deviations of the variables are large, showing that the mutual funds differ considerably in their characteristics. Finally, it is interesting to see that the mean expense ratio falls each year, which might be driven by investors shifting to low-expense funds. In order to ensure the validity of the results of cross-sectional model (17), diagnostic tests have been conducted on the model and sample data. As evidence of heteroscedasticity is found, regression outputs are reported with White (1980) heteroscedasticity consistent t-statistics.
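How a heteroscedasticity consistent t-statistic differs from the conventional one can be shown for the simplest case. The sketch below is a simplification with a single regressor and an intercept, using the HC0 form of the White (1980) estimator; model (17) has several regressors, and the function name and data are hypothetical.

```python
import math


def ols_with_white_se(xs, ys):
    """Slope, HC0 standard error and t-statistic for y = a + b*x + e."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
    intercept = y_bar - slope * x_bar
    resid = [y - intercept - slope * x for x, y in zip(xs, ys)]
    # HC0: each squared residual is weighted by its own squared deviation in x,
    # instead of assuming one common error variance for all observations.
    var_slope = sum((x - x_bar) ** 2 * e ** 2 for x, e in zip(xs, resid)) / s_xx ** 2
    se = math.sqrt(var_slope)
    return slope, se, slope / se


# Hypothetical data, for illustration only.
slope, se, t_stat = ols_with_white_se(
    [1.0, 2.0, 3.0, 4.0, 5.0], [2.1, 3.9, 6.2, 7.8, 10.1]
)
```

The point estimate is the ordinary least squares slope; only the standard error, and therefore the t-statistic, changes.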

Table 4

Summary Statistics Cross-Sectional Model (17)

Raw Mean and Raw Std dev are calculated before taking the natural logarithm. HM is the fund’s market timing coefficient from equation (8). TER is the fund’s total expense ratio, calculated by dividing all management fees and operating costs by average TNA. LN Assets is the natural logarithm of the fund’s total net assets. LN Age is the natural logarithm of the fund’s age in days.

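Cross-correlations are given in the columns HM to LN Age.

| Year | Variable | Raw Mean | Raw Std dev | HM | TER | LN Assets | LN Age |
|---|---|---|---|---|---|---|---|
| 2014 | HM | -0.03 | 0.42 | 1.00 | | | |
| 2014 | TER | 1.72 | 0.96 | 0.00 | 1.00 | | |
| 2014 | LN Assets | $126.3M | $286.3M | 0.05 | 0.16 | 1.00 | |
| 2014 | LN Age | 9.8yrs | 6.3yrs | 0.08 | 0.11 | 0.27 | 1.00 |
| 2015 | HM | 0.12 | 0.54 | 1.00 | | | |
| 2015 | TER | 1.64 | 0.93 | 0.11 | 1.00 | | |
| 2015 | LN Assets | $126.4M | $169.1M | 0.07 | 0.13 | 1.00 | |
| 2015 | LN Age | 10.2yrs | 6.6yrs | 0.08 | 0.18 | 0.26 | 1.00 |
| 2016 | HM | 0.13 | 0.48 | 1.00 | | | |
| 2016 | TER | 1.57 | 0.89 | 0.09 | 1.00 | | |
| 2016 | LN Assets | $118.3M | $259.0M | 0.10 | 0.12 | 1.00 | |
| 2016 | LN Age | 10.4yrs | 7.0yrs | 0.04 | 0.24 | 0.24 | 1.00 |
| 2017 | HM | . | . | . | | | |
| 2017 | TER | 1.52 | 0.86 | . | 1.00 | | |
| 2017 | LN Assets | $145.3M | $319.9M | . | 0.12 | 1.00 | |
| 2017 | LN Age | 10.7yrs | 7.2yrs | . | 0.19 | 0.22 | 1.00 |
| 2018 | HM | 0.32 | 0.31 | 1.00 | | | |
| 2018 | TER | 1.48 | 0.82 | 0.08 | 1.00 | | |
| 2018 | LN Age | 10.5yrs | 7.6yrs | 0.02 | 0.20 | 0.23 | 1.00 |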

5. Results

5.1. Mutual fund security selection and market timing abilities

The aim of the first part of the study is to design a model that performs well in measuring both security selection and market timing abilities. Appendix B compares multiple factor performance models and two market timing performance models. The last column shows the combined model (8) of the best factor performance model and market timing performance model. Per column, the models are sorted from left to right on their adjusted 𝑅2 and loglikelihoods. The first column of the table shows that the Carhart four factor model (4) on average fits the data best; only for portfolios 2 to 4 does the Fama and French five factor model (5) prove to be a better fit. The second column shows that the Henriksson and Merton market timing model (7) on average fits the data best; only for portfolios 7, 8 and 10 does the Treynor and Mazuy market timing model (6) prove to be a better fit. When the best factor performance model (Carhart’s four factor model) is combined with the best market timing performance model (Henriksson and Merton’s market timing model) into the combined model (8), the combined model has the highest loglikelihood and adjusted 𝑅2 across all portfolios.
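The two comparison criteria can be written out in a few lines. The sketch uses the textbook definitions of adjusted 𝑅2 and of the likelihood ratio statistic, together with the 3.84 critical value quoted in Appendix B; the loglikelihood values in the example are hypothetical, and the one-degree-of-freedom comparison is meant for the step from the Carhart model to the combined model (8), which adds a single regressor.

```python
def adjusted_r2(r2, n_obs, n_regressors):
    """Adjusted R-squared for a regression with an intercept."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)


def lr_statistic(loglik_restricted, loglik_full):
    """Two times the loglikelihood difference between nested models."""
    return 2.0 * (loglik_full - loglik_restricted)


CHI2_5PCT_1DF = 3.84  # 5% critical value, one added regressor

# Hypothetical loglikelihoods for a four-factor and a five-regressor model.
lr = lr_statistic(-150.0, -147.0)
prefer_full = lr > CHI2_5PCT_1DF
```

In this hypothetical case the statistic equals 6.0, so the additional regressor would be retained at the 5% level.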

Table 5 shows the results of a time-series regression of the combined model (8) over the time period 2009 to 2018. The table, however, only gives an aggregate impression of mutual fund performance. Therefore, Appendix C provides a summary of the time-series regression results when estimated for each fund individually. Consistent with the efficient market hypothesis, portfolios 1 to 10 show that even the top 10% of mutual funds with the highest returns are not able to significantly outperform the market. The market timing factor seems to be significant and positive for most portfolios, especially those that have a relatively high monthly excess return. The results are supported by the individual estimation of Appendix C, which provides evidence for 993 funds (out of 1690) underperforming the market and 271 mutual funds exhibiting positive market timing abilities. Furthermore, there does not seem to be much of a difference between the UCITS, AIFM and all-funds portfolios. The only notable difference is that the AIFM portfolio seems to be less driven by smaller stocks (SMB) and last year’s winning stocks (WML). From Table 5 and Appendix C it is further evident that in general mutual funds do not prefer certain stock characteristics, but more funds seem to prefer smaller stocks, lower book-to-market equity ratio stocks (HML) and last year’s winners.

Table 5

Time-series Regression Results for Combined Model (8) on Portfolios


lowest comprise decile 10. Mkt-Rf is the return on the region’s value-weight market portfolio minus the U.S. one-month T-bill rate. SMB and HML are the region’s Fama and French’s factor-mimicking portfolios for size and book-to-market equity. WML is the region’s Carhart’s factor-mimicking portfolio for lagged momentum. HM is the region’s Henriksson and Merton’s market timing factor. Alpha is the intercept of the model. Heteroskedasticity and autocorrelation consistent t-statistics are given in parentheses and followed by * for significance at the 10% level,** for significance at the 5% level and *** for significance at the 1% level.

Coefficients of the combined model (8), with t-statistics in parentheses.

| Portfolio | Monthly Excess Return | Alpha | Mkt-Rf | SMB | HML | WML | HM |
|---|---|---|---|---|---|---|---|
| 1 (high) | 1.25% | 0.001 (0.73) | 1.051 (30.90)*** | 0.168 (2.63)*** | -0.004 (-0.06) | -0.088 (-1.98)** | 0.163 (2.69)*** |
| 2 | 0.82% | -0.002 (-2.18)** | 1.051 (46.11)*** | 0.080 (1.76)* | -0.019 (-0.50) | -0.003 (-0.13) | 0.122 (3.14)*** |
| 3 | 0.69% | -0.003 (-2.83)*** | 1.034 (47.08)*** | 0.056 (1.49) | -0.007 (-0.19) | 0.004 (0.19) | 0.085 (2.41)** |
| 4 | 0.58% | -0.004 (-4.15)*** | 1.030 (53.02)*** | 0.031 (0.83) | -0.013 (-0.47) | 0.016 (1.53) | 0.072 (2.29)** |
| 5 | 0.49% | -0.005 (-5.74)*** | 1.028 (55.59)*** | 0.015 (0.44) | -0.006 (-0.25) | 0.022 (2.47)** | 0.069 (2.56)** |
| 6 | 0.40% | -0.005 (-6.50)*** | 1.026 (58.21)*** | 0.013 (0.33) | 0.010 (0.48) | 0.033 (2.54)** | 0.059 (2.32)** |
| 7 | 0.30% | -0.006 (-7.08)*** | 1.014 (55.66)*** | 0.011 (0.27) | -0.008 (-0.37) | 0.032 (1.86)* | 0.038 (1.56) |
| 8 | 0.19% | -0.007 (-6.43)*** | 1.012 (35.24)*** | 0.029 (0.57) | -0.025 (-0.88) | 0.044 (1.61) | 0.013 (0.37) |
| 9 | 0.03% | -0.009 (-7.98)*** | 1.021 (39.24)*** | 0.085 (1.61) | 0.007 (0.22) | 0.056 (2.21)** | 0.018 (0.43) |
| 10 (low) | -0.46% | -0.014 (-9.19)*** | 0.988 (26.13)*** | 0.196 (2.60)** | 0.038 (0.90) | 0.126 (2.63)*** | 0.009 (0.13) |
| UCITS | 0.43% | -0.005 (-6.35)*** | 1.029 (54.70)*** | 0.075 (1.95)* | -0.007 (-0.32) | 0.025 (2.59)** | 0.066 (2.46)** |
| AIFM | 0.41% | -0.005 (-6.87)*** | 1.009 (50.94)*** | 0.038 (0.99) | 0.022 (1.11) | 0.028 (1.60) | 0.067 (2.51)** |
| ALL | 0.42% | -0.005 (-6.56)*** | 1.025 (54.29)*** | 0.068 (1.78)* | -0.002 (-0.08) | 0.025 (2.42)** | 0.066 (2.52)** |


worse than the European mutual funds within the sample of this study. Where the majority of alphas in Table 5 are above -0.1%, the alphas reported by, for example, Grinblatt and Titman (1989), Wermers (2000) and Fama and French (2010) fluctuate around -1%. Furthermore, the preference in stock characteristics differs from sample to sample. However, the finding that mutual funds generally do not prefer certain stock characteristics but that more funds seem to prefer smaller stocks, lower book-to-market equity ratio stocks and last year’s winners is in line with several studies on the United States’ mutual fund market (e.g., Gruber, 1996; Carhart, 1997 among others). Finally, where this study reports significant positive market timing abilities for European mutual funds, most United States studies found no mutual fund managers that could time the market or more mutual fund managers that mistimed the market. Only Bello and Janjigian (1997) found positive market timing abilities, for United States’ aggressive, small-company and growth investment objective funds.

5.1.1. Robustness tests of the combined time-series model (8)

The results of the Fama and MacBeth (1973) validity test are shown by Appendix D. The intercepts, 𝛾0 in equations (12) and (13), of the five-factor model and combined model (8) are both insignificant, thus not rejecting the null hypothesis that the intercept is zero. Therefore, the Fama and French five-factor model (5) and combined model (8) capture the most variation in average excess returns and pass the main part of the validity test. The betas, 𝛾1 in equations (12) and (13), of all factors but Mkt-Rf are significant across all equations, which implies that including those factors explains average excess returns better than omitting them. Mkt-Rf, however, becomes insignificant when other risk measures are included. Hence, Fama and MacBeth (1973) argue that there is qualitative but not quantitative support for the market factor, as the slope is not of the appropriate size.
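The two-step logic behind this validity test can be sketched for a single factor. This is a simplified illustration under stated assumptions: one factor instead of several, plain rather than heteroscedasticity and autocorrelation consistent t-statistics, and hypothetical function names and data; equations (12) and (13) themselves are not reproduced here.

```python
import math


def ols(xs, ys):
    """Intercept and slope of a simple regression of ys on xs."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
    return y_bar - slope * x_bar, slope


def mean_and_t(series):
    """Time-series mean of a coefficient and its plain t-statistic."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series) / (n - 1)
    # The t-statistic is undefined when the series shows no variation.
    t = mean / math.sqrt(var / n) if var > 0 else float("nan")
    return mean, t


def fama_macbeth(returns, factor):
    """returns[i][t]: excess return of portfolio i in period t; factor[t]: factor return."""
    # Step 1: a time-series regression per portfolio gives its factor loading.
    betas = [ols(factor, r_i)[1] for r_i in returns]
    # Step 2: one cross-sectional regression per period on the estimated loadings.
    gammas = [ols(betas, [r_i[t] for r_i in returns]) for t in range(len(factor))]
    gamma0 = mean_and_t([g[0] for g in gammas])
    gamma1 = mean_and_t([g[1] for g in gammas])
    return betas, gamma0, gamma1


# Hypothetical data in which returns are exactly beta times the factor.
factor = [1.0, -2.0, 3.0, 0.5]
true_betas = [0.5, 1.0, 1.5]
returns = [[b * f for f in factor] for b in true_betas]
betas, gamma0, gamma1 = fama_macbeth(returns, factor)
```

In this constructed case the average intercept is zero and the average slope equals the mean factor return, which is the pattern a well-specified model should approach.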


Finally, a dummy variable significance test has been conducted to examine the impact of the measured structural break during the Greek financial crisis in August 2010. The last paragraph of Section 4 discusses the results of the parameter stability test given by Appendix A. The output of the combined regression including the dummy variable (16) is reported by Appendix F. If we compare Appendix F to the original time-series regression output of Table 5, we can see that the dummy variable (D) is insignificant across all portfolios and the other factors of the portfolios are practically unaffected in sign and significance by the inclusion of the variable. Therefore, I conclude that the measured structural break during the Greek financial crisis does not materially affect the regression output and does not cause the parameters to become unstable.


Table 6

Cross-Sectional Regression Results for Model (17)

HM is the fund’s market timing coefficient from the combined time-series model (8). TER is the fund’s total expense ratio, calculated by dividing all management fees and operating costs by average TNA. LN Assets is the natural logarithm of the fund’s total net assets. LN Age is the natural logarithm of the fund’s age in days. N is the number of funds included in the regression and is equal to the number of funds within the sample that are in operation and have at least 12 consecutive returns that year. Heteroskedasticity consistent t-statistics are given in parentheses and followed by a * for significance at the 10% level, a ** for significance at the 5% level and a *** for significance at the 1% level.

| Year | Constant | HM | TER | LN Assets | LN Age | Adj 𝑅2 | N |
|---|---|---|---|---|---|---|---|
| 2014 | -2.60 (-2.12)** | -7.67 (-11.49)*** | -1.12 (-10.40)*** | 0.12 (2.09)** | -0.17 (-1.17) | 0.483 | 979 |
| 2015 | -4.51 (-1.72)* | -24.28 (-40.25)*** | -0.86 (-4.12)*** | 0.17 (1.64)* | 0.09 (0.32) | 0.852 | 986 |
| 2016 | -4.88 (-1.30) | -22.21 (-7.17)*** | -1.46 (-3.99)*** | 0.14 (1.39) | -0.08 (-0.16) | 0.581 | 990 |
| 2017 | -1.48 (-0.66) | . | -0.66 (-0.93) | 0.17 (0.84) | -0.35 (-0.61) | 0.030 | 1024 |
| 2018 | -6.00 (-3.91)*** | -14.80 (-18.06)*** | -1.85 (-9.05)*** | -0.13 (-2.17)** | 0.46 (2.26)** | 0.484 | 1100 |

While there is no literature on the impact of a fund’s market timing ability on adjusted performance, some studies have analysed the relation of a fund’s total expense ratio, total net assets and age with its adjusted performance. The strong negative relation of a European mutual fund’s total expense ratio with its adjusted performance is also found for United States’ mutual funds (e.g., Carhart, 1995; Malkiel, 1995; Petajisto, 2013). The measured impact of a European fund’s total net assets and age on its adjusted performance is low and differs in significance and sign from year to year. United States’ mutual fund performance studies support the weak impact of both characteristics on adjusted fund performance but are divided in significance and sign. While Golec (1996) finds a negative impact for both characteristics at the 10% significance level, Prather, Bertin and Henker (2004) do not find any significant relation at all. Guercio and Reuter (2014) only study the impact of a fund’s total net assets on its performance and find a negative relation at the 1% significance level.

5.2.1. Robustness tests of cross-sectional model (17)


when the market timing variable is changed. Also, there is no material difference between the significances of the variables of the models. Therefore, the robustness test provides evidence in favour of the validity of the regression results. Furthermore, notice that the inclusion of a timing factor for 2017 does not change the signs or significances of the other variables and that the market timing variable itself is insignificant.

Secondly, the validity of the regression results is tested by removing the market timing factor from cross-sectional model (17), which translates to a regression in the form of cross-sectional model (21) where only fund characteristics are regressed on adjusted fund performance. The results of this robustness test are shown by Appendix H. If we compare the regression results of model (17) including the market timing variable shown in Table 6 to those of model (21) without the market timing variable shown in Appendix H, we can see that there are a few sign changes across the variables throughout the years. Also, for 2015 the significance of the variable TER disappears when the market timing variable is excluded and there are a few changes in significance across the variables throughout the years. Although the main conclusions, like the strong negative relation between adjusted fund performance and TER, do not change, it can be concluded that the regression results are sensitive to the exclusion of the market timing variable.

Finally, the validity of the regression results is tested by conducting a panel regression (22) over the time period. The results of this robustness test are shown by Appendix I. If we compare the regression results of cross-sectional model (17) shown in Table 6 to those of the random effects model (22) shown in Appendix I, we can see that the signs and significance levels of the variables of the panel regression correspond to the aggregate view of the cross-sectional regression results. The presence of a Simpson’s Paradox as described in Section 3 is therefore unlikely. This is a particularly important result for the market timing variable (HM), as the counterintuitive strong negative relation between the market timing variable and adjusted fund performance raised the suspicion of the paradox. Furthermore, although the panel regression results are only applicable to funds that exist throughout the entire period, notice the aggregate impact of the variables on adjusted fund performance. In contrast to the mixed results of Table 6, the positive and significant impact of a fund’s total net assets (LN Assets) in Appendix I provides evidence for the existence of economies of scale within the European mutual fund market.

6. Conclusion


selection and market timing abilities to a model that performs well in measuring both skills. Thereafter, to examine the attributes of the funds’ performance, I estimated a cross-sectional model that allows us to assess the impact of measured timing abilities and fund characteristics on adjusted performance. The data includes 1690 mutual funds that are domiciled in Europe and invest in European equity. Active, merged and dead funds are all included, thereby avoiding any form of survivorship bias.

In order to examine whether the funds are able to outperform and time the market, I construct a model that performs well in measuring both security selection and market timing abilities. Several traditional factor and market timing models are therefore compared by their adjusted 𝑅2 and loglikelihoods. For this sample, Carhart’s (1997) four factor model performs best in measuring performance and Henriksson and Merton’s (1981) market timing model performs best in measuring market timing abilities. The combined model outperforms all models considered in explaining European mutual fund performance and is therefore regressed on a time-series of mutual fund returns between 2009 and 2018. Consistent with the efficient market hypothesis, the results show that the majority of mutual funds (939 out of 1690) significantly underperform the market and that only 2 funds were able to outperform the market within the timeframe. This is in line with existing literature for both the United States (e.g., Daniel et al., 1997; Wermers, 2000; Cremers and Petajisto, 2002) and Europe (e.g., Romacho and Cortez, 2006; Vidal-Garcia, 2013; Andreu, Matallín-Sáez and Sarto, 2018). Furthermore, for 285 out of 1690 mutual funds significant market timing abilities are found, of which a great majority of 271 funds show positive significant market timing abilities. Most literature on market timing abilities of mutual funds indeed finds that on average mutual funds do not time the market. However, significant positive market timing abilities seem to be present for some samples of United States’ mutual funds (e.g., Bello and Janjigian, 1997) but have not yet been found for European mutual funds.
In line with mutual fund performance studies from the United States (e.g., Gruber, 1996; Carhart, 1997) and Europe (e.g., Otten and Bams, 2002) it is further evident that the funds in general do not prefer certain stock characteristics, but more funds seem to prefer smaller stocks, lower book-to-market equity ratio stocks and last year’s winning stock.


In order to examine the attributes of the funds’ performance, their alphas and market timing factors from the combined time-series regression and the funds’ expense ratios, total net assets and ages are collected. Thereafter a cross-sectional regression is run for the years 2014 to 2018. The results show a strong negative relation between the market timing variable and adjusted fund performance, which indicates that funds that are bad at market timing perform better than funds that are good at market timing. Treynor and Mazuy (1966) and Henriksson and Merton (1981), however, argue that a portfolio manager who is good at timing the market generates higher returns. This relationship is therefore in contrast to what we would expect, and no financial literature is found that substantiates a negative relation between the two. Furthermore, in line with literature from both the United States (e.g., Carhart, 1995; Malkiel, 1995; Petajisto, 2013) and Europe (e.g., Otten and Bams, 2002), the test results show that funds with low expense ratios perform better than funds with high expense ratios. Moreover, some evidence is found for a relation between a fund’s total net assets and adjusted performance and between a fund’s age and adjusted performance. However, in contrast to the market timing variable and expense ratio variable, the relations are relatively weak, small and change in sign and significance from year to year. United States’ mutual fund performance studies support the weak impact of the characteristics on adjusted fund performance but are divided in significance and sign (Golec, 1996; Prather, Bertin and Henker, 2004; Guercio and Reuter, 2014).

Several robustness tests are applied to test the validity of the cross-sectional impact model. First of all, the validity of the regression results is tested by replacing the Henriksson and Merton (1981) market timing factor with the Treynor and Mazuy (1966) market timing factor. As there are no sign changes or material differences between the significances of the variables when the market timing variable is changed, the robustness test provides evidence in favour of the validity of the regression results. Secondly, the validity of the regression results is tested by removing the market timing variable and thereby only regressing fund characteristics on adjusted fund performance. Some changes in both sign and significance are observed. Although the main inferences of the limited model do not change, the regression results of the cross-sectional impact model seem to be sensitive to the exclusion of the market timing variable. Finally, the validity of the regression results is tested by conducting a panel regression over the time period. As the signs and significance levels of the variables of the panel regression correspond to the aggregate view of the cross-sectional regression results, a Simpson’s Paradox is found unlikely and the robustness test provides evidence in favour of the validity of the regression results.


fluctuate around -1%. Furthermore, where this study reports significant positive market timing abilities for European mutual funds, most United States studies found no mutual fund managers that could time the market or more mutual fund managers that mistimed the market. Differences in regulation might partly explain these differences. In contrast to United States’ mutual funds, European mutual funds are, first of all, not required to distribute their realized capital gains to their investors every year and can thereby increase their net asset values. Furthermore, European mutual fund managers have greater leeway in the investment tactics they can apply, because they are allowed to engage in complex capital structures and significant short positions and do not have to diversify as much as United States’ mutual funds.


Appendix

A. Diagnostic Test Results of Time Series Regression Model (8)

The table reports the outcomes of several diagnostic tests on the time-series regression of model (8). Breusch-Godfrey test (1978) checks for autocorrelation. Chow test (1960) checks for parameter instability, UBP stands for unknown breakpoint and KBP stands for known breakpoint (August 2010). Breusch-Pagan test (1979) checks for heteroscedasticity. Dickey-Fuller test (1979) checks for non-stationarity. Engle’s (1982) ARCH-LM checks for autoregressive conditional heteroscedasticity; reported are the lags up to which ARCH effects are found at the 5% significance level. The test statistics are followed by a * for significance at the 10% level, a ** for significance at the 5% level and a *** for significance at the 1% level.

Breusch–Godfrey | Chow UBP | Chow KBP


B. Comparison of Performance Measures

CAPM is Jensen’s transformation of the Capital Asset Pricing Model of equation (2), FF3 is Fama and French’s three factor model of equation (3), FF5 is Fama and French’s five factor model of equation (5), CH4 is Carhart’s four factor model of equation (4), TM is Treynor and Mazuy’s market timing model of equation (6), and HM is Henriksson and Merton’s market timing model of equation (7). Log L is the loglikelihood measure and adj 𝑅2 is the adjusted 𝑅2 of the time-series regressions on 10 portfolios based on their return, two portfolios based on their characteristics and one portfolio including all funds. The models are sorted from left to right on their average adj 𝑅2/log L. A # means that 2 times the difference in loglikelihood between the model and the model directly to the left exceeds 3.84, the 5% critical value of a chi-squared distribution with one degree of freedom. A # is added for equation (8) and compares the loglikelihood of equation (8) to that of CH4.

Factor Performance Models Market Timing Performance Models Both

CAPM CAPM FF3 FF3 FF5 FF5 CH4 CH4 TM TM HM HM CM (8) CM (8)


C. Time-series Regression Results for Combined Model (8) at Individual Fund Level

Mkt-Rf is the return on the region’s value-weight market portfolio minus the U.S. one-month T-bill rate. SMB and HML are the region’s Fama and French’s factor-mimicking portfolios for size and book-to-market equity. WML is the region’s Carhart’s factor-mimicking portfolio for lagged momentum. HM is the region’s Henriksson and Merton’s market timing factor. Alpha is the intercept of the model. Negative denotes a negative coefficient and Positive denotes a positive coefficient. The given numbers refer to the amount of mutual funds.

Coefficient | Negative | Positive | Negative at 10% | Positive at 10% | Negative at 5% | Positive at 5% | Negative at 1% | Positive at 1%


D. Summary Statistics Fama and MacBeth’s (1973) Validity Test

The table reports outcomes of the Fama and MacBeth two-step validity test on return sorted portfolios 1 (high) to 10 (low) over the sample period January 2009 to December 2018. CAPM is Jensen’s transformation of the Capital Asset Pricing Model of equation (2), FF3 is Fama and French’s three factor model of equation (3), FF5 is Fama and French’s five factor model of equation (5), CH4 is Carhart’s four factor model of equation (4), TM is Treynor and Mazuy’s market timing model of equation (6), and HM is Henriksson and Merton’s market timing model of equation (7). Alpha is the intercept of the model. Heteroskedasticity and autocorrelation consistent t-statistics are given in parentheses and followed by a * for significance at the 10% level, a ** for significance at the 5% level and a *** for significance at the 1% level.

Model Alpha Mkt-Rf SMB HML WML RMW CMA TM HM
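The two-step procedure behind this table can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis's implementation: it uses a single factor for brevity (the models in the table use up to five), it reports plain t-statistics rather than the heteroskedasticity and autocorrelation consistent ones described above, and all function names are hypothetical.

```python
import math
from statistics import mean, stdev


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def summarize(series):
    """Return (mean, t-statistic) using a plain standard error (not HAC)."""
    avg = mean(series)
    std_err = stdev(series) / math.sqrt(len(series))
    t_stat = avg / std_err if std_err > 0 else float("nan")
    return avg, t_stat


def fama_macbeth(portfolio_returns, factor):
    """Two-step test with one factor.

    portfolio_returns: list of excess-return series, one per portfolio, each of length T.
    factor: factor return series of length T.
    """
    # Step 1: time-series regression per portfolio gives its factor loading.
    betas = [ols_simple(factor, returns)[1] for returns in portfolio_returns]

    # Step 2: one cross-sectional regression per period of returns on the loadings.
    intercepts, premia = [], []
    for t in range(len(factor)):
        cross_section = [returns[t] for returns in portfolio_returns]
        intercept, premium = ols_simple(betas, cross_section)
        intercepts.append(intercept)
        premia.append(premium)

    # Time-series averages of the period-by-period estimates.
    return {"alpha": summarize(intercepts), "premium": summarize(premia)}
```

Under this sketch, an average intercept close to zero is what a valid pricing model would produce for the ten return-sorted portfolios.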

E. Summary Statistics Gibbons, Ross and Shanken’s (1989) Test

The table reports outcomes of the GRS validity test on return-sorted portfolios 1 (high) to 10 (low) over the sample period January 2009 to December 2018. CAPM is Jensen’s transformation of the Capital Asset Pricing Model of equation (2), FF3 is Fama and French’s three-factor model of equation (3), FF5 is Fama and French’s five-factor model of equation (5), CH4 is Carhart’s four-factor model of equation (4), TM is Treynor and Mazuy’s market timing model of equation (6), and HM is Henriksson and Merton’s market timing model of equation (7).

Model Mean alpha F-statistic P-value Mean adj. R²
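For reference, the F-statistic in this table is commonly written in the form below. The notation is an assumption introduced here, not taken from the thesis: N is the number of test portfolios, T the number of monthly observations, L the number of factors, \hat{\alpha} the vector of estimated intercepts, \hat{\Sigma} the maximum-likelihood residual covariance matrix, and \hat{\mu} and \hat{\Omega} the sample mean vector and maximum-likelihood covariance matrix of the factors.

```latex
\mathrm{GRS}
  = \frac{T - N - L}{N}\,
    \frac{\hat{\alpha}^{\top}\hat{\Sigma}^{-1}\hat{\alpha}}
         {1 + \hat{\mu}^{\top}\hat{\Omega}^{-1}\hat{\mu}}
  \;\sim\; F_{N,\;T-N-L}
  \qquad \text{under } H_0\colon \alpha_1 = \dots = \alpha_N = 0 .
```

With the ten return-sorted portfolios (N = 10) and the 120 months from January 2009 to December 2018 (T = 120), a four-factor model such as CH4 (L = 4) would give an F-distribution with 10 and 120 - 10 - 4 = 106 degrees of freedom.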
