Performance of SRI mutual funds: The effect of screening intensity on funds' performance

Academic year: 2021


Performance of SRI mutual funds:

The effect of screening intensity on funds’ performance

Maciej Tarczon 10837810

University of Amsterdam Under supervision of Ms. Magdalena Jurgiel June 2020

Abstract

This paper provides an overview of the performance of U.S. socially responsible investment (SRI) funds and examines the relationship between the performance of such funds and the intensity of the screening procedures they apply. To do so, we study an unbalanced sample of 52 U.S.-domiciled SRI mutual funds, investing both domestically and internationally. Our results suggest that there is no relationship between the SRI funds’ performance and their screening intensity. We also analyse the categories of screens and conclude that social screens are the only ones that affect the performance of the funds, and that the effect is negative. The last part of our analysis disaggregates the screening procedures and examines the effects of individual screens on the performance of the SRI funds. The screens that prove to affect the funds’ performance are the community, diversity, conflict risk, executive pay, alcohol, tobacco, and shareholder engagement screens (all negatively). While our study has certain limitations, such as the joint hypothesis problem, survivorship bias, look-ahead bias and simultaneous causality bias, they do not negate the value of our work and they present opportunities for future research.

Keywords: Socially Responsible Investing (SRI), investment screens, screening intensity, CAPM-model, Fama-French 5-factor model, performance evaluation, asset management, mutual funds

This document is written by Maciej Tarczon who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.

Table of Contents

1. Introduction
2. Theoretical background
2.1. The effect of screening intensity on the SRI funds’ performance
2.1.1. Being good costs
2.1.2. Being good pays
2.1.3. Being good costs at first, pays after
2.2. The effect of screen types on the funds’ performance
2.3. The effect of investment style on the funds’ performance
3. Data
3.1. Fund data
3.2. Factor benchmark
3.3. Screens
3.3.1. Environmental
3.3.2. Social
3.3.3. Governance
3.3.4. Products
3.3.5. Other/Qualitative and Shareholder Engagement
3.4. Descriptive analysis
4. Methodology
4.1. Empirical research design
4.2. Testable hypotheses
5. Empirical results
5.1. OLS Assumptions
5.2. Alpha Analyses
5.3. Hypotheses Tests
6. Limitations
7. Conclusion
Reference list
Appendix

1. Introduction

Managing an investment fund is not an easy task. The manager is expected to generate abnormal returns, even though a vast number of competitors aim at the same thing. The challenge does not get any easier when managers not only focus on returns but also apply criteria to what they invest in. That is exactly what socially responsible investment (henceforth SRI) funds are trying to do: on the one hand, maximising shareholder wealth, and on the other, supporting companies that do good for society, the environment, and all their stakeholders in general.

This paper examines whether, and if so how, screening procedures affect the financial performance of socially responsible investment funds. To do so, we analyse the relationship between the number and the type of investment criteria applied by an SRI fund and its financial performance. Renneboog, ter Horst and Zhang (2008b) define SRI as investment processes that incorporate social, environmental and ethical considerations in the decision whether or not to invest in a given firm. SRI funds apply certain screens in order to select or exclude investment opportunities based mostly on ecological, corporate governance, ethical, or social criteria. SRI investors can be divided into two main categories, based on the main goal of their investment: the wealth-maximisers, who aim at financial utility and positive risk-adjusted returns, and the socially responsible ones, whose main utility is of a non-financial nature, derived purely from holding a sustainable asset (Renneboog, ter Horst, Zhang, 2008a). Therefore, one could assume that financial returns may not always be the primary concern of the investors.

The SRI industry, while relatively young, is growing rapidly, which makes it important and potentially valuable to examine and understand it better. Since the 1990s it has experienced significant growth, which was likely linked to ethical consumerism (paying more for a product that is aligned with personal values). In 2005 the SRI market, in the US alone, grew to $2.3 trillion (1200% growth compared to a decade earlier), which amounted to approximately 10% of total assets under management at the time (Renneboog, ter Horst, Zhang, 2008a). Since the industry is rapidly gaining market share, it is increasingly important for institutional and retail investors to understand whether and how SRI procedures affect the funds’ performance. Given that there are different types of investors, such information would help to reduce costly, misinformed decisions.

This paper extends the previous literature on the topic in several respects. Firstly, in order to measure the financial performance of the funds we use the Fama-French 5-factor model rather than the single-factor CAPM or the 4-factor Carhart model. Secondly, we use a longer time window and more recent data than older studies in the field. Thirdly, we not only examine the effect of screening intensity, as measured by the number of screens, on the funds’ performance, but also go further by analysing the effects of screens by category and of individual screens. The question we will try to answer through empirical analysis is as follows:

Does the screening intensity affect the fund performance of the US SRI funds?

The main findings of the analyses conducted here are not aligned with the previous literature on the topic. We do not find a significant relationship (be it linear or curvilinear) between the screening intensity of the funds (using different measures of it) and their performance. When analysed by category, only the social screens seem to affect fund performance (negatively). We also examined whether transversal screens (which exclude firms across industries) have a different effect on the funds’ performance than industrial screens (those that exclude a whole industry). Here, too, we did not find any significant relationship, implying that neither type affects the funds’ performance. Lastly, looking at individual screens, the ones with significant (and negative) effects on performance are community, diversity, conflict risk, executive pay, alcohol, tobacco, and shareholder engagement.

The remainder of the paper is structured as follows. The next chapter provides an overview of what is already known about the SRI industry, the screens being used, and the performance of SRI funds, and links these topics to each other. Then we move to the empirical part, starting with the research sample, where we explain which data we consider for this research and how we collected it. This chapter is followed by the research methodology. The results of the analyses are presented and discussed in chapter five. Lastly, we discuss the limitations of this paper and potential opportunities for future research on the topic. The conclusion summarises the findings of our analyses and the main points made in the paper.

2. Theoretical background

2.1. The effect of screening intensity on the SRI funds’ performance

2.1.1. Being good costs

The literature on screening intensity presents three potential relationships between screening intensity and the performance of socially responsible investment funds. The first, negative one, is based on portfolio theory: as screening intensity increases, the problem of under-diversification becomes more severe (Lee, Humphrey, Benson & Ahn, 2010). Consequently, portfolio managers could face higher risk because they cannot sufficiently diversify their portfolios. Similar conclusions were reached by Barnett & Salomon (2006). Another argument is that the companies SRI funds invest in have higher operational costs than their conventional counterparts. That is due, among other things, to the social standards that “sin stocks” do not have to adhere to. Also, because of the potential difficulty of establishing sin companies, the surviving firms might enjoy higher monopolistic gains and thus higher returns. This view is supported by Fabozzi et al. (2008).

2.1.2. Being good pays

On the other hand, several studies highlight the potential positive effects that higher screening intensity might have on the SRI funds’ performance. First of all, Capelle-Blancard & Monjon (2012) argue that under-diversification is not really that problematic, since the marginal gains from diversification decrease rapidly as the number of stocks in the portfolio goes up. Therefore, excluding certain securities from one’s investment pool should not be too detrimental, unless the correlation of these financial assets with the stock market index is negative. Building upon that, as reported by Barnett and Salomon (2004), screening intensity might be an indicator of stronger stakeholder relationships, which would benefit performance. Moreover, the same study argues that as the number of screens goes up, SRI funds tend to gain valuable goodwill, such as a good reputation and better customer and employee relations. Such an additional intangible asset would add to the funds’ performance.

Another argument on this side, presented by Lee, Humphrey, Benson, and Ahn (2010), is that being more socially responsible can provide the funds with a competitive advantage. That could happen as a result of, for example, higher entry barriers, becoming more unique and harder to substitute, and adding a “sustainability” argument to bargaining power and rivalry. The authors also argue that stakeholder-focused organisations reduce firm-specific risk by reducing their exposure to demands from different interest groups, which is a good sign for investors. Next, as presented by Barnett and Salomon (2004), higher screening intensity might be a proxy for the effective management of a company, potentially leading to better performance. Last but not least, the same study argues that including more screens in the SRI selection process might reveal valuable information not available to the investors of conventional funds, which would also contribute to effective management and the resulting higher performance. All that being said, the main counterargument to the under-diversification problem is that while the investment pool of SRI funds might be smaller, it is richer, and therefore brings more to the table after all (Lee, Humphrey, Benson, & Ahn, 2010).

2.1.3. Being good costs at first, pays after

The third, more complex, potential relationship between the two variables is curvilinear. Several studies support this view, indicating that funds with 1) low screening intensity are still able to diversify sufficiently, and thus do not experience lower performance; 2) high screening intensity invest in a smaller but richer pool thanks to positive stakeholder relationships, leading to higher expected performance; 3) medium screening intensity suffer, since they can no longer fully diversify and do not yet receive the benefits of higher expected returns on their investments (Barnett and Salomon, 2006). Lee, Humphrey, Benson, and Ahn (2010) add to that by studying the relationship between screening intensity and total risk, concluding that as screening intensity increases, the respective SRI funds experience a reduction in total risk. As they further argue, since this change is not related to idiosyncratic risk (sufficient diversification is still possible), funds with more screens seem to have lower systematic risk. They later show the relationship between this risk and screening intensity, concluding that while at the beginning managers, being aware of the higher risk of SRIs, choose larger funds with lower beta, they are not able to do so with more screens.

2.2. The effect of screen types on the funds’ performance

One factor not yet mentioned, which might also affect the relationship between screening intensity and funds’ performance, is the heterogeneity of the effects that different screen types have on a fund’s performance. For example, as explained by Renneboog, Ter Horst, and Zhang (2008b), SRI funds’ returns do indeed tend to decrease with increasing screening intensity, but only when the screening criteria are based on social and corporate governance grounds. This is no longer the case when the screens are of the ethical, sin or environmental type. Closely linked to that, while industrial screens (those that exclude a whole industry) do tend to pull down the fund’s performance, transversal ones (which exclude firms across industries) do not show this effect (Capelle-Blancard & Monjon, 2012). Therefore, it seems reasonable to assume that distinguishing several screen types when analysing the relationship between screening intensity and funds’ performance might affect the results. As, to our knowledge, no studies in the field of screening intensity have taken this into account so far, it will be a contribution of this paper to the literature on the topic.

2.3. The effect of investment style on the funds’ performance

Arguments supporting international rather than domestic investments, and the potential benefits derived from such an investment style, were already made by Levy and Sarnat (1970). They argued that by investing in countries with low correlation with the investor’s country, one can reduce the variance of the portfolio and thus enhance its performance. Redman, Gullet and Manakyan (2000) come to similar conclusions, saying that, in general, diversification can benefit the returns of mutual funds investing domestically. On the other hand, Droms and Walker (1994) did not find evidence of significant outperformance by international funds (a result also reached by Gallo & Swanson, 1995, and Cumby & Glen, 1990). However, Droms and Walker (1994) elaborate on that, saying that while internationally investing funds tend merely to track the index benchmark, this does not mean they are an inefficient investment. On the contrary, international investments seem to be a fair opportunity that earns returns adequate to the risk level. Controlling for the investment style and comparing performance across internationally and domestically oriented funds is another contribution of this study to the literature on the topic.

Another valuable insight is brought up by Otten, Bauer and Koedijk (2002), who compare the performance of ethical and conventional funds, controlling for investment style. They did not arrive at significant differences, implying that the ethical funds do not underperform relative to their conventional counterparts. However, they did report certain country-specific results, worth mentioning here in the analysis of only US funds. Firstly, using the CAPM model, the domestic ethical US funds seem to significantly underperform, which is not the case for the international ethical US funds. What is more, while German and British ethical funds show significantly lower market risk as compared to their conventional equivalents, for the US ethical funds there was no significant difference regarding the market beta. Moreover, using the conditional multi-factor model, the US domestic ethical funds again underperform. Therefore, it does seem reasonable to assume that domestic ethical US funds might perform worse than the international equivalents. Now, while the research conducted by Otten, Bauer and Koedijk (2002) is relatively close to the focus of this study, it did not examine the screening intensity factor, treating the sustainability of the funds more as a dummy variable taking values of 0 or 1, nothing in between. In this regard, this study will extend the current knowledge on the topic by accounting for the heterogeneity of the screening intensity among the sustainable funds.

3. Data

3.1. Fund data

Since the whole category of SRIs within mutual funds is relatively young, the data on it are rather limited. Given that the U.S. capital market is one of the most developed in the world, this paper examines only US-domiciled mutual funds investing in equity, both internationally and domestically. The sample was determined by the United States Social Investment Forum (USSIF) list as of 29 May 2020, which includes all mutual funds that apply at least one SRI screen when determining what to invest in. To limit duplication in the collected data where multiple share classes were available, only the A class was considered (or individual shares where there was a choice between institutional and individual). Otherwise, we would have come across multiple entries with the same screening policies and merely different cost structures, investment horizons and target investors. All of the funds in the sample are open-end. Moreover, to improve the comparability and validity of the results, all funds taken into consideration had at least 3 years of returns history at the moment of data collection. These filters result in a final sample of 52 SRI funds.

The next step was combining the sample data from the USSIF website with the returns data retrieved from the Factset database. The returns used in this paper form an unbalanced sample of monthly returns, ranging from April 1987 to December 2019 (2020 is omitted due to the potentially unusual behaviour during the COVID-19 pandemic outbreak). Unfortunately, the sample suffers from survivorship bias, since it was not possible to identify dead or merged funds in the databases. That might lead to upwardly biased results. However, as reported by Renneboog et al. (2007), the rate at which funds drop out of the sample in the SRI fund industry is rather low, which makes this potential bias rather small. Nevertheless, it should be taken into consideration. The control variables, such as the funds’ age (in days), size, management fees and investment style, were all also collected from the USSIF. Here it is worth mentioning that these data might suffer from look-ahead bias. Since we did not have access to historical data on these factors, we had to assume that they were constant over the years. This will be discussed further in a later section of the paper.

3.2. Factor benchmark

The main model used for performance evaluation in this study is the Fama and French 5-factor model; as robustness checks we use the CAPM, the Fama and French 3-factor model, and its extension, the Carhart 4-factor model. We therefore had to download the risk factors associated with each model. The market risk premium for the U.S. SRI funds investing in domestic equity is estimated as the return on the market portfolio (consisting of all NYSE, AMEX, and NASDAQ firms) minus the one-month U.S. T-bill rate. The small-minus-big factor, SMB, is an estimate of the return difference between a small cap portfolio and a large cap portfolio. Following the same logic, high-minus-low, HML, is the gap between a high and a low book-to-market value portfolio; the momentum factor, MOM, is the difference between a portfolio consisting of 12-month winners and one of 12-month losers; and the RMW (robust-minus-weak) factor refers to the difference in returns between firms with robust and weak profitability. Lastly, the CMA factor accounts for the return differences between low and high investment firms (conservative/aggressive investment strategies). For the internationally oriented funds, the market risk premium equals the return on the developed markets value-weighted market portfolio minus the one-month U.S. T-bill rate. The remaining factors are the same as for the domestically investing funds, except that the stocks taken into account are selected from the developed markets, not only the U.S. All of the mentioned data are accessible through the Kenneth R. French Data Library (French, 2020).

3.3. Screens

To finish the data collection process, all of the screens applied by each fund had to be retrieved. The source of that information was the USSIF. It reports six main categories of screens used by the funds, focused on environmental, social, governance, products, other/qualitative, or shareholder engagement aspects. The data were obtained on 31/03/2020. Once again, since there was no access to historical data on the screening policies, look-ahead bias might be present. However, as reported by Geczy, Stambaugh, and Levin (2003), these policies are rather difficult to change, which makes the implicit assumption of constant policies seem reasonable. The following subsections discuss the types of screens further.

3.3.1. Environmental

This category is subdivided into three different screens: climate/clean tech, pollution/toxics, and environment/other. The first one focuses on the risks and opportunities regarding the climate change, greenhouse emissions, or business devoted to developing environmentally sustainable technologies (e.g. clean energy generation). The pollution/toxics screen considers the extent of toxicity and pollution created by the products and operations of the firm, and their management and mitigation, such as recycling or water management. The last one is a general category devoted to any other environmental issues not mentioned in the previous categories.

3.3.2. Social

The social type consists of 5 separate screens. The first one, community development, focuses on providing affordable housing, fair consumer loans, or supporting small local businesses. Next, the diversity & EEO screen considers equal employment opportunities and diversity policies. The human rights screen regards the adherence to basic human rights by the companies within their internal operations as well as in the countries where they do business. Typically, the focus of this screen is the relation with the indigenous people. Labor relations accounts for the consideration of the companies’ employee relation programs, employee involvement, health and safety, retirement benefits and union relations. Lastly, the conflict risk screen means a complete or partial exclusion of businesses that operate in countries categorised as repressive regimes or sponsors of terrorism.

3.3.3. Governance

The governance screens are divided into two subcategories. The first, board issues, considers directors’ independence, diversity, salary and responsiveness to shareholders. The second subcategory, executive pay, concerns the executive pay policies, especially whether they are reasonable and in line with the long-term interest of shareholders and other stakeholders.

3.3.4. Products

The product screens almost uniformly result in the omission of certain product categories. A fund’s application of an alcohol, defense/weapons, gambling or tobacco screen therefore results in no investments flowing into companies fully or partly engaged in these industries (e.g. cigarette or alcohol manufacturing, selling military weapons, or operating gambling interests). The other screen in the product category is animal welfare, which focuses on the companies’ approach to animal testing where it is not necessary, especially when it involves harming the animals, and on the treatment of animals raised or used for food.

3.3.5. Other/Qualitative and Shareholder Engagement

One of the last two screen categories reported by the USSIF, other/qualitative, is the consideration of other criteria related to the aforementioned categories (environmental, social, governance or products) that was not captured by the previously described screens. Last but not least, the shareholder engagement screen regards whether or not there is a private dialogue in place on the environment, social and governance issues with companies in the investment strategy portfolio.
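The screen taxonomy described in Sections 3.3.1 to 3.3.5 can be written down as a small data structure. The following Python sketch is only an illustration: the snake_case labels and the example fund are hypothetical, and the functions simply count screens (the screening intensity, out of 17) and compute the share of applied screens per category.

```python
# Screen taxonomy as described in Sections 3.3.1-3.3.5 (17 screens in total).
# The snake_case identifiers are illustrative labels, not USSIF field names.
SCREENS_BY_CATEGORY = {
    "environmental": ["climate_clean_tech", "pollution_toxics", "environment_other"],
    "social": ["community_development", "diversity_eeo", "human_rights",
               "labor_relations", "conflict_risk"],
    "governance": ["board_issues", "executive_pay"],
    "products": ["alcohol", "animal_welfare", "defense_weapons", "gambling", "tobacco"],
    "other_qualitative": ["other_qualitative"],
    "shareholder_engagement": ["shareholder_engagement"],
}


def screening_intensity(applied):
    """Number of distinct known screens that a fund applies (0 to 17)."""
    known = {s for screens in SCREENS_BY_CATEGORY.values() for s in screens}
    return len(set(applied) & known)


def category_scores(applied):
    """Share of each category's screens that the fund applies (0.0 to 1.0)."""
    applied = set(applied)
    return {
        category: sum(s in applied for s in screens) / len(screens)
        for category, screens in SCREENS_BY_CATEGORY.items()
    }


# Hypothetical fund applying four screens (not a fund from the sample).
example_fund = ["alcohol", "tobacco", "board_issues", "human_rights"]
print(screening_intensity(example_fund))
print(category_scores(example_fund))
```

Keeping the taxonomy in one dictionary means the total count and the per-category shares are derived from the same source and cannot drift apart.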

3.4. Descriptive analysis

As mentioned earlier, the final sample of this study consists of 52 SRI funds, investing both domestically and internationally. The oldest fund in the sample is the New Alternatives Fund A Shares (NALFX), with an inception date of September 1982. The fund with the largest amount of assets under management (AUM) is Calvert Equity Portfolio A (CSIEX), with $3,577.53 million. On average, as reported in Table I below, the SRI funds in our sample yield 0.38% monthly excess returns with a 4.70% standard deviation. Judging by the kurtosis and skewness measures, our data do not seem to be normally distributed. The kurtosis of 6.18, well above 3 (the value for a normal distribution), implies heavy tails, that is, an abnormally high frequency of extreme observations. The skewness of -1.08 suggests a negatively skewed distribution, meaning that the left tail is longer and large negative returns are more common than large positive ones. Perhaps with a larger dataset we could have avoided this problem. Having said that, these properties do not invalidate our analyses. As reported by Peiró (1999), sample tests of distribution symmetry of financial returns are of little value because of the non-normality of such data. Lastly, on average the sampled funds use 13.81 screens (out of 17 possible), with a standard deviation of 3.07. More detailed data can be found in Table II, with descriptive statistics reported for each fund individually.

Table I

SRI Descriptive Analysis

The dataset ranges from April 1987, to December 2019. The table presents the summary statistics of the returns of the 52 sampled U.S. domiciled SRI funds. The reported mean return and standard deviation are on a monthly basis.

Panel A. Returns                                  SRI
  Sample mean (monthly)                           0.38%
  Monthly SD                                      4.70%
  Kurtosis                                        6.18
  Skewness                                        -1.04

Panel B. Characteristics
  Total AUM (in millions of $)                    845.51
  Fund Age (in days)                              5079.31
  Screening intensity (# of utilised screens)     13.81
  SD in intensity                                 3.07
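To illustrate how the skewness and kurtosis measures in the descriptive analysis can be computed, here is a minimal Python sketch. It assumes the simple population (biased) moment estimators and non-excess kurtosis (so a normal distribution has a value of 3); the estimator actually used for Table I is not stated, and the return series below is made up.

```python
from statistics import fmean


def skewness_and_kurtosis(returns):
    """Population skewness and non-excess kurtosis of a return series."""
    n = len(returns)
    mean = fmean(returns)
    # Central moments of order 2, 3 and 4
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical monthly returns in percent with one large loss.
example_returns = [-10.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0]
skew, kurt = skewness_and_kurtosis(example_returns)
print(f"skewness={skew:.2f}, kurtosis={kurt:.2f}")
```

A single large loss is enough to produce negative skewness and a kurtosis above 3, which is the pattern the sample exhibits.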

4. Methodology

4.1. Empirical research design

First of all, we need to test the quality of the collected data and whether the assumptions of OLS hold. We begin by testing the returns for heteroscedasticity using White’s test. Here, the null hypothesis states that the data are homoscedastic, and the alternative hypothesis implies unrestricted heteroscedasticity. Since we hope not to reject the null, we are looking for high p-values in this test. Next, we evaluate the autocorrelation of the returns using the Breusch-Godfrey LM test. The null hypothesis states that there is no serial correlation, so high p-values are desirable here as well.
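The logic of White’s test can be sketched for the simplest case of one regressor. The code below is an illustration under stated assumptions, not the procedure used in the thesis: it assumes NumPy is available, uses simulated data, and relies on the fact that with two auxiliary terms (x and x squared) the LM statistic is chi-square with 2 degrees of freedom, whose survival function is exp(-LM/2). A low p-value leads to rejecting the null of homoscedasticity.

```python
import math

import numpy as np


def white_test_single_regressor(x, y):
    """White's LM test for y = a + b*x + e; returns (LM statistic, p-value)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid_sq = (y - X @ coef) ** 2
    # Auxiliary regression of squared residuals on a constant, x and x^2
    Z = np.column_stack([np.ones_like(x), x, x ** 2])
    gamma, *_ = np.linalg.lstsq(Z, resid_sq, rcond=None)
    fitted = Z @ gamma
    ss_res = np.sum((resid_sq - fitted) ** 2)
    ss_tot = np.sum((resid_sq - resid_sq.mean()) ** 2)
    lm = len(x) * (1.0 - ss_res / ss_tot)
    # LM ~ chi-square(2) under the null; its survival function is exp(-lm / 2)
    return lm, math.exp(-lm / 2.0)


# Simulated example: the error variance grows with x, so the null is false.
rng = np.random.default_rng(0)
x_sim = rng.uniform(1.0, 5.0, 500)
y_sim = 1.0 + 2.0 * x_sim + x_sim * rng.normal(0.0, 1.0, 500)
lm_stat, p_value = white_test_single_regressor(x_sim, y_sim)
print(f"LM={lm_stat:.2f}, p-value={p_value:.4g}")
```

For models with several factors, the auxiliary regression would also include cross-products and the degrees of freedom would change accordingly, so a statistics library is the more practical choice there.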

Moving to the main part of the analysis, we test whether the screening intensity has an effect on the U.S. SRI international funds’ performance. With that goal in mind, we first needed to estimate the funds’ performance. For this study, the main performance measure is the alpha derived from the 5-factor model developed by Fama and French (2015). However, as a robustness check, the alphas were also estimated using the CAPM model, the Fama and French 3-factor model, and the Carhart 4-factor model.

Starting with the most basic model, CAPM, the following regression will be estimated:

$$r_{i,t} - r_{f,t} = \alpha_{CAPM} + \beta_{M,i}\,(r_{M,t} - r_{f,t}) + \varepsilon_{i,t} \tag{1}$$

The left-hand side of the equation is the observed excess return of a given fund. The first term on the right-hand side, $\alpha_{CAPM}$, is Jensen’s alpha (1968). This intercept is an estimate of the return above the prediction of the CAPM model. It could also be a signal of mispricing or a compensation for an unaccounted-for risk factor. Here, this is the fund performance measure. $r_{M,t} - r_{f,t}$ is the market risk premium, that is, the excess return of a broadly defined market over the risk-free rate.
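As an illustration of equation (1), the following minimal Python sketch estimates Jensen’s alpha and the market beta by ordinary least squares with a single regressor. The function name and the numbers are hypothetical and serve only to show the mechanics.

```python
from statistics import fmean


def capm_alpha_beta(fund_excess, market_excess):
    """OLS estimates of (alpha, beta) from regressing fund excess returns
    on market excess returns, as in equation (1)."""
    mean_y = fmean(fund_excess)
    mean_x = fmean(market_excess)
    s_xy = sum((x - mean_x) * (y - mean_y)
               for x, y in zip(market_excess, fund_excess))
    s_xx = sum((x - mean_x) ** 2 for x in market_excess)
    beta = s_xy / s_xx
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Hypothetical monthly excess returns in percent.
market = [1.0, -2.0, 3.0, 0.5, -1.5, 2.5]
fund = [0.1 + 0.9 * m for m in market]  # built with alpha 0.1 and beta 0.9
alpha_hat, beta_hat = capm_alpha_beta(fund, market)
print(alpha_hat, beta_hat)
```

Because the example fund is constructed without noise, the estimates recover the alpha and beta used to build it up to floating-point error.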

Following this, we run the more sophisticated analyses that evaluate the performance, namely the Fama-French 3-factor model, the Carhart 4-factor model, and the Fama-French 5-factor model. For this purpose, the following regressions will be estimated:

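3-factor model:

$$r_{i,t} - r_{f,t} = \alpha_{FF3} + \beta_{M,i}\,(r_{M,t} - r_{f,t}) + \beta_{SMB,i}\, r_{SMB,t} + \beta_{HML,i}\, r_{HML,t} + \varepsilon_{i,t} \tag{2}$$

4-factor model:

$$r_{i,t} - r_{f,t} = \alpha_{Carhart} + \beta_{M,i}\,(r_{M,t} - r_{f,t}) + \beta_{SMB,i}\, r_{SMB,t} + \beta_{HML,i}\, r_{HML,t} + \beta_{MOM,i}\, r_{MOM,t} + \varepsilon_{i,t} \tag{3}$$

5-factor model:

$$r_{i,t} - r_{f,t} = \alpha_{FF5} + \beta_{M,i}\,(r_{M,t} - r_{f,t}) + \beta_{SMB,i}\, r_{SMB,t} + \beta_{HML,i}\, r_{HML,t} + \beta_{RMW,i}\, r_{RMW,t} + \beta_{CMA,i}\, r_{CMA,t} + \varepsilon_{i,t} \tag{4}$$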

Where $\alpha$ is the alpha (the measure of performance, for the 3-, 4- and 5-factor models respectively); $r_{M,t} - r_{f,t}$ is the market risk premium; $r_{SMB,t}$ accounts for the risk premium due to differences in returns between small and large cap portfolios (in terms of market values); $r_{HML,t}$ stands for the high versus low book-to-market ratio portfolios; $r_{MOM,t}$ is the momentum factor; $r_{RMW,t}$ is a factor that refers to the firms’ robust or weak profitability; and $r_{CMA,t}$ is a factor that accounts for potential return differences between low and high investment firms (conservative/aggressive investment strategies). The general goal is to identify the alphas and test whether they are significantly different from 0, assuming that the alphas can be treated as a relevant measure of the funds’ performance (more specifically, the skill of the funds’ managers).
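The multi-factor regressions and the test of whether an alpha differs from zero can be sketched as follows. This is a minimal illustration, assuming NumPy is available and using classical (homoscedastic) OLS standard errors; the function name, the simulated factor data and the loadings are all hypothetical.

```python
import numpy as np


def factor_model_alpha(fund_excess, factors):
    """OLS fit of fund excess returns on K factor returns.

    Returns (alpha, loadings, t-statistic of alpha). `factors` is a (T, K)
    array whose columns are, for example, MKT-RF, SMB, HML, RMW and CMA.
    """
    y = np.asarray(fund_excess, dtype=float)
    F = np.asarray(factors, dtype=float)
    X = np.column_stack([np.ones(len(y)), F])  # intercept column gives alpha
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof
    # Classical covariance matrix of the OLS estimates: sigma^2 (X'X)^-1
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t_alpha = coef[0] / np.sqrt(cov[0, 0])
    return coef[0], coef[1:], t_alpha


# Simulated example: 240 months, five factors, true alpha of 0.05.
rng = np.random.default_rng(42)
sim_factors = rng.normal(0.0, 1.0, size=(240, 5))
true_loadings = np.array([0.95, 0.20, -0.10, 0.05, 0.15])
sim_excess = 0.05 + sim_factors @ true_loadings + rng.normal(0.0, 0.01, 240)
alpha, loadings, t_alpha = factor_model_alpha(sim_excess, sim_factors)
print(alpha, loadings, t_alpha)
```

The same function covers the 3-, 4- and 5-factor specifications, since only the number of columns in the factor array changes.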

The next step is to regress the estimated alphas on the funds’ screening intensity (the number of screens, from 1 to 17, based on the USSIF, 2020), on screening intensity squared to account for potential nonlinearity, and on additional control variables. Firstly, the age variable was selected because, according to Otten, Bauer and Koedijk (2002), there seems to be a learning effect in U.S. ethical funds: while they tended to trail conventional funds at first, their performance has converged more recently. Thus, the age of the fund seems to be of high importance, especially in the U.S.

Secondly, the size of a mutual fund is added, as it can affect the fund’s returns both positively and negatively. On the one hand, economies of scale could result in lower transaction costs and greater information processing capabilities, leading to higher returns. On the other hand, with more and more assets on hand, the number of investment opportunities decreases (Jones & Wermers, 2011). Also, transactions can become large enough to affect equity prices, which prevents the execution of certain orders (Barnett & Salomon, 2006). Moreover, more assets under management can mean lower efficiency of the fund (Chen, Huang, Kubik, 2004). To account for size effects, the natural logarithm of assets under management is included in the regression.

Next, the management fees are included in the model, since higher fees can mean that even if there is excess return, the investor does not receive it, which lowers the investor-relevant performance. Renneboog, ter Horst and Zhang (2008b) also treat fees as a control variable; however, to make sure we correctly account for the potential changes that management fees can cause, we also run the analyses using post-fee returns and the resulting alphas.

Lastly, there is a dummy variable for whether the fund invests internationally or domestically only. This variable should capture both macro-economic benefits and risks coming from international investments. The screening analysis will be done using the following model:

$$\alpha_{\text{CAPM/FF5},i,t} = \beta_0 + \beta_1\,\text{Screening Intensity} + \beta_2\,\text{Screening Intensity}^2 + \sum_{j}\beta_j\,\text{Fund Characteristics}_{i,t} + \varepsilon_{i,t} \quad (5)$$

Where $\alpha_{\text{CAPM/FF5},i,t}$ is the monthly risk-adjusted alpha derived from the CAPM or Fama-French 5-factor model. Screening Intensity is the total number of screens applied by the fund, ranging from 1 to 17. Screening Intensity squared is included to account for the potential functional misspecification and nonlinearity of the screening-returns relationship. We hypothesize that there might be such a relationship based on the previous work by Barnett and Salomon (2006), as well as on the scatter plots created using the collected data (Figure 1 and Figure 2). While with the alphas derived from the Fama-French 5-factor model the relationship between the screening intensity and the fund performance seems to be rather linear and positive, using CAPM alphas we observe a potentially curvilinear relationship between the fund performance and screening intensity. Lastly, Fund Characteristics is the set of a given fund’s age, size, management fees, and investment style.
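To make the right-hand side of model (5) concrete, the following sketch assembles the regressors for a single fund. It is only an illustration: the function name, the argument names and the units (age in years, assets in USD millions, fee in %) are our assumptions for the example and the input values are hypothetical.

```python
import math


def regression_5_row(screens, age_years, assets_usd_m, fee_pct, international):
    """Regressors of model (5) for one fund: constant, screens, screens^2, controls."""
    if not 1 <= screens <= 17:
        raise ValueError("screening intensity must lie between 1 and 17")
    return [
        1.0,                            # constant (beta_0)
        float(screens),                 # screening intensity (beta_1)
        float(screens) ** 2,            # screening intensity squared (beta_2)
        float(age_years),               # fund age
        math.log(assets_usd_m),         # natural log of assets under management
        float(fee_pct),                 # management fee
        1.0 if international else 0.0,  # dummy: invests internationally
    ]


# Hypothetical fund, not taken from the sample.
row = regression_5_row(screens=6, age_years=12, assets_usd_m=250.0,
                       fee_pct=0.8, international=True)
print(row)
```

Stacking one such row per fund gives the design matrix on which the alphas are regressed.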

Then, we also want to study the effects of screen categories on the financial performance of SRI funds. To do that, we divide the screens into the six main categories described earlier. The score of the fund for each category is a value between 0 and 1, calculated by dividing the number of screens from a given category applied by the fund by the total number of screens in the category. The screening intensity from the previous equation is then replaced by the categories’ scores:

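$$\alpha_{\text{FF5},i,t} = \beta_0 + \beta_1\,\text{Screening Intensity} + \beta_2\,\text{Screening Intensity}^2 + \sum_{c}\beta_c\,\text{Screening Categorised}_{c} + \sum_{j}\beta_j\,\text{Fund Characteristics}_{i,t} + \varepsilon_{i,t} \quad (6)$$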
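The category score described above can be sketched as follows. The grouping of screens into categories in this example is illustrative only (two categories with assumed members), and the function name `category_scores` is hypothetical; it is not the full six-category assignment used in our analysis.

```python
def category_scores(applied_screens, categories):
    """Share of each category's screens that the fund applies (between 0 and 1)."""
    applied = set(applied_screens)
    return {
        name: len(applied & set(members)) / len(members)
        for name, members in categories.items()
    }


# Illustrative grouping only (assumed for this example).
categories = {
    "social": ["community", "diversity", "labor relations",
               "conflict risk", "human rights"],
    "products": ["alcohol", "tobacco"],
}
fund_screens = ["community", "diversity", "tobacco"]  # hypothetical fund

scores = category_scores(fund_screens, categories)
print(scores)
```

A fund applying two of five screens in a category thus receives a score of 0.4 for it, and adding one more screen raises the score by 0.2.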

Furthermore, we want to examine the distinction between the potential relationships of transversal and industrial screens with the financial performance of the SRI funds. For this purpose, dummy variables were created depending on whether a fund employs a number of transversal/industrial screens above the average of all funds. Thresholds equal to the overall averages were chosen since all of the examined funds employ at least one transversal as well as at least one industrial screen. Thus, what we want to determine is whether the funds that use these screens extensively obtain different performance results. The resulting regression is:

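$$\alpha_{\text{FF5},i,t} = \beta_0 + \beta_1\,\text{Screening Intensity} + \beta_2\,\text{Transversal Screens} + \beta_3\,\text{Industrial Screens} + \sum_{j}\beta_j\,\text{Fund Characteristics}_{i,t} + \varepsilon_{i,t} \quad (7)$$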
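The construction of the above-average dummies can be sketched as below. The fund labels and screen counts are hypothetical, and `above_average_dummy` is an illustrative name, not a function from our estimation code.

```python
def above_average_dummy(counts):
    """Map each fund's screen count to 1 if it exceeds the cross-fund mean, else 0."""
    mean = sum(counts.values()) / len(counts)
    return {fund: int(n > mean) for fund, n in counts.items()}


# Hypothetical numbers of transversal screens per fund.
transversal_counts = {"fund_a": 2, "fund_b": 5, "fund_c": 8, "fund_d": 1}
transversal_dummy = above_average_dummy(transversal_counts)
print(transversal_dummy)
```

The same function would be applied separately to the counts of industrial screens. A fund exactly at the average receives a 0 here, which is one possible reading of "above the average".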

Lastly, we want to study the effect of individual screens on the financial performance of the SRI funds. This is done by creating a dummy variable for each of the screens. Then, the dummies are included in a regression similar to the aforementioned ones:

$$\alpha_{\text{FF5},i,t} = \beta_0 + \sum_{k}\beta_k\,\text{Screen}_{k,i} + \sum_{j}\beta_j\,\text{Fund Characteristics}_{i,t} + \varepsilon_{i,t} \quad (8)$$

Where the model is generally the same as before, except that the first right-hand term is now a dummy variable which takes on a value of 1 if fund i applies screen k.

4.2. Testable hypotheses

The first hypothesis, based on the scatter plot (Figure II), and the arguments of intangible value coming from screening intensity is:

H1. Higher screening intensity leads to higher U.S. SRI funds’ performance.

That will be tested with regression number (5). The null hypothesis implies that screening intensity does not affect the U.S. SRI funds’ performance, and therefore $H_0: \beta_1 = 0$. The alternative hypothesis states that the coefficient is significantly different from zero and positive: $H_1: \beta_1 > 0$.

Based on the arguments presented by Barnett and Salomon (2006) and by Lee, Humphrey, Benson and Ahn (2010), this relationship would not, however, be linear, but would rather change as the screening intensity increases. The second hypothesis, taking that into consideration, is therefore:

H2. The relationship between screening intensity and the U.S. SRI funds’ performance is curvilinear.

This is tested with regression number (5) as well. Here, the null hypothesis is that screening intensity and screening intensity squared do not differ regarding the direction of their effect on the performance. Thus, the betas of both measures should be of the same sign: $H_0: \beta_1 \cdot \beta_2 > 0$. The alternative hypothesis is therefore $H_1: \beta_1 \cdot \beta_2 < 0$.

Next, based on the analysis conducted by ter Horst, Zhang and Renneboog (2007), we expect all of the screen categories to negatively affect funds’ performance, except for the environmental screens. Therefore:

H3. Except for the environmental screens, all screening categories have a negative effect on the funds’ performance.

This hypothesis will be tested with regression (6). The null hypothesis is $H_0: \beta_{environmental} = \beta_{social} = \beta_{governance} = \beta_{products} = \beta_{other/qualitative} = \beta_{shareholder\ engagement} = 0$, and the alternative $H_1: \beta_{environmental} > 0 \cap \beta_{social} = \beta_{governance} = \beta_{products} = \beta_{other/qualitative} = \beta_{shareholder\ engagement} < 0$.

What is more, since there is evidence in the previous literature that the effect of screens on the funds’ performance differs depending on whether they exclude whole industries or companies across industries, it would be insightful to examine whether distinguishing between industrial and transversal screens would change the examined relationship. Thus, the fourth hypothesis (in two parts) is:

H4 A) Transversal screens do not have any impact on the performance of the SRI funds.

H4 B) Industrial screens negatively affect the performance of the SRI funds.

This is tested with regression number (7). The null hypothesis for H4 A) is that the transversal screens coefficient is not statistically different from zero, and thus $H_0: \beta_{Transversal} = 0$. The null hypothesis for H4 B) states that the industrial screens beta does not differ from zero ($H_0: \beta_{Industrial} = 0$). The alternative hypothesis for H4 A) implies that the coefficient of transversal screens does differ from zero ($H_1: \beta_{Transversal} \neq 0$), and the one for H4 B) states that the industrial screens coefficient is negative ($H_1: \beta_{Industrial} < 0$).

Lastly, we are going to look into the effects of specific screens on the funds’ performance. Since, to our knowledge, no previous research exists that studies such a relationship, we state the following as our last hypothesis:

H5. None of the screens affects funds’ performance on its own.

This will be tested with our last regression, number (8). The null hypothesis here is $H_0: \sum_{k}\beta_k\,\text{Screen}_{k,i} = 0$, and the alternative $H_1: \sum_{k}\beta_k\,\text{Screen}_{k,i} \neq 0$.

In the following sections we will look into the results of the analysis and compare them with the results reported in the previous literature.

5. Empirical results

5.1. OLS Assumptions

We begin this section by discussing the linear regression assumptions and whether they hold in our data. After running White’s test for heteroskedasticity we conclude that some funds suffer from this problem (6 out of 52 in the sample, as visible in Table III). Next, using the Breusch-Godfrey LM test for autocorrelation we observe that only 3 funds suffer from it (Table IV). Therefore, to avoid biased standard errors and incorrect conclusions, we make use of Newey-West standard errors with a lag of 1 when calculating the alphas used as the performance measure in the subsequent analyses. Lastly, while we do observe some extreme minimum returns in Table II, all the mean returns seem normal. Therefore, there do not seem to be any influential outliers in the data.
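For illustration, the sketch below computes a Newey-West (Bartlett-kernel) standard error for the alpha in the simplest, single-factor case. It is a simplified stand-in rather than our estimation code: it covers the CAPM only, applies no small-sample correction, runs on simulated returns, and the function name `capm_alpha_newey_west` is hypothetical.

```python
import math
import random


def capm_alpha_newey_west(excess_market, excess_fund, lags=1):
    """OLS alpha of a one-factor model with a Newey-West standard error."""
    n = len(excess_fund)
    mx = sum(excess_market) / n
    my = sum(excess_fund) / n
    sxx = sum((x - mx) ** 2 for x in excess_market)
    beta = sum((x - mx) * (y - my)
               for x, y in zip(excess_market, excess_fund)) / sxx
    alpha = my - beta * mx
    resid = [y - alpha - beta * x for x, y in zip(excess_market, excess_fund)]

    # Score vectors g_t = e_t * (1, x_t).
    g = [(e, e * x) for e, x in zip(resid, excess_market)]

    # "Meat": S = Gamma_0 + sum_l w_l (Gamma_l + Gamma_l'), Bartlett weights w_l.
    S = [[0.0, 0.0], [0.0, 0.0]]
    for lag in range(lags + 1):
        w = 1.0 if lag == 0 else 1.0 - lag / (lags + 1)
        for t in range(lag, n):
            for a in range(2):
                for b in range(2):
                    term = g[t][a] * g[t - lag][b]
                    if lag > 0:
                        term += g[t - lag][a] * g[t][b]
                    S[a][b] += w * term

    # "Bread": (X'X)^{-1} for X = [1, x].
    sx = sum(excess_market)
    sx2 = sum(x * x for x in excess_market)
    det = n * sx2 - sx * sx
    inv = [[sx2 / det, -sx / det], [-sx / det, n / det]]

    # Sandwich covariance V = inv * S * inv; alpha is the first coefficient.
    tmp = [[sum(inv[a][c] * S[c][b] for c in range(2)) for b in range(2)]
           for a in range(2)]
    V = [[sum(tmp[a][c] * inv[c][b] for c in range(2)) for b in range(2)]
         for a in range(2)]
    se_alpha = math.sqrt(V[0][0])
    return alpha, se_alpha, alpha / se_alpha


# Simulated monthly excess returns (in %), purely hypothetical.
rng = random.Random(1)
market = [rng.gauss(0.6, 4.5) for _ in range(120)]
fund = [-0.2 + 0.95 * m + rng.gauss(0.0, 1.0) for m in market]

alpha, se, t_stat = capm_alpha_newey_west(market, fund, lags=1)
print(round(alpha, 3), round(se, 3), round(t_stat, 2))
```

Setting `lags=0` reduces the estimator to heteroskedasticity-robust (White) standard errors, which makes the contribution of the autocorrelation correction easy to inspect.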

5.2. Alpha Analyses

Since the CAPM is the simplest of the models used in this paper, we begin this subsection with it. As we can see in Table V, using the CAPM, none of the funds significantly (at the 5% level) outperformed the market, whereas 36 funds underperformed it (5% significance). That would suggest that, assuming the CAPM captures the variety of risks associated with investing in the equities held by these funds, quite a large share of the SRI funds generates negative alphas.

Using the more sophisticated models, we reach similar conclusions. Once again, as can be seen in Table V, no funds outperformed the market, while 34, 36, and 35 underperformed it at the 5% significance level, using respectively the Fama-French 3-factor model, the Carhart 4-factor model, and the Fama-French 5-factor model. When post-fee returns are used to estimate the Fama-French 5-factor model, we get 38 funds significantly (at the 5% level) underperforming the market, and no funds outperforming it. Overall, assuming these models capture all the risks, there seems to be some evidence of SRI funds underperforming the market.

5.3. Hypotheses Tests

We now turn to the main part of this analysis, namely determining the relationship between the screening intensity of the sampled funds and their financial performance. The analysis starts with the impact of screening intensity as measured by the total number of screens applied by the fund. As mentioned before, there are arguments on both sides. More screening can, on the one hand, mean investing in a smaller but richer pool, valuable intangible assets, disclosure of useful information, better reputation and employee relations, more effective management and higher entry barriers. On the other hand, it is argued that SRI funds with more screens can be under-diversified due to the intensity of the criteria and may have higher operational costs and costly social standards to adhere to (lowering the financial performance).

In order to determine the relationship and test the first hypothesis, we run model (5). The results can be found in Tables VI and VII. Starting with the first column of Table VI, we can see that the beta coefficient of the screening intensity variable (screens) is hardly above zero (0.009) and not significant at the 5% level. After adding the control variables, the beta rises only marginally (the new beta equals 0.014) and is still insignificant. Therefore, while we may not be able to conclude that screening intensity significantly boosts the financial performance, it also does not seem to harm the returns. As a robustness check we also tested this relationship using the CAPM alphas (reported in Table VII), reaching the same conclusions. To control for the management fees, we also run the analysis using post-fee alphas. That, however, did not change the derived conclusions. With these outcomes, we do not find support for our first hypothesis, which states that screening intensity leads to higher risk-adjusted performance of the SRI funds. The results we observe are not aligned with those of Lee, Humphrey, Benson and Ahn (2010) and of Capelle-Blancard and Monjon (2012), who reported a negative relationship between the screening intensity and funds’ performance. The reason for that might lie in a different sample and different performance-evaluation models. Lee, Humphrey, Benson and Ahn (2010) use the Carhart 4-factor model to measure performance, while we utilize the Fama-French 5-factor model. Also, they include funds investing not only in equity, whereas we focus on equity-only funds. Capelle-Blancard and Monjon (2012), on the other hand, examine French rather than U.S.-domiciled SRI funds, which might have also affected the conclusions of their study.

Now, as previously discussed, the relationship between the screening intensity and the risk-adjusted performance might also be curvilinear. To check whether that might be true, another term is added to the model, the square of the screening intensity measure (screens squared). The results can be found again in Table VI. The first noteworthy finding is a decrease of the adjusted R-squared in the second model, from 0.056 to 0.054. Next, we do indeed observe different signs of the coefficients of the screens variable (-0.047) and of screens squared (0.003); however, once again these betas are significant at neither the 5% nor the 10% level. Furthermore, none of the control variables seems to significantly affect the performance of the funds. The same implications can be derived from Table VII, which reports the same regression with CAPM rather than FF5 alphas (the coefficients of screens and screens squared then being equal to, respectively, -0.075 and 0.003). Using post-fee alphas for the FF5 analysis yields similar results. Therefore, we find no support for the second hypothesis of our paper, that the relationship between the screening intensity and SRI funds’ performance is curvilinear. The above-mentioned results are to some extent inconsistent with the previous findings of Barnett and Salomon (2006), who reported a curvilinear relationship. That might be due to a different source of screening data (12 rather than 17 screens considered, sourced from The Social Investment Forum), a different sample (determined again by The Social Investment Forum; also including funds investing in bonds), and a different, earlier time period (years 1972-2000, as compared to 1987-2019 considered in this analysis).
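To illustrate what the reported point estimates would imply if they were significant, the sketch below computes the turning point of the fitted quadratic, i.e. the screening intensity at which the estimated effect changes direction. Since the coefficients are insignificant, this is purely an arithmetic illustration; the function name `turning_point` is hypothetical.

```python
def turning_point(b_linear, b_squared):
    """Stationary point of b_linear * s + b_squared * s**2, at s = -b_linear / (2 * b_squared)."""
    if b_squared == 0:
        raise ValueError("no turning point without a quadratic term")
    return -b_linear / (2 * b_squared)


# Point estimates for screens and screens squared as reported in the text.
ff5_turn = turning_point(-0.047, 0.003)   # FF5 alphas
capm_turn = turning_point(-0.075, 0.003)  # CAPM alphas
print(round(ff5_turn, 1), round(capm_turn, 1))
```

With a negative linear and a positive squared coefficient the fitted curve is U-shaped, and both turning points (roughly 8 and 12.5 screens) fall inside the observed range of 1 to 17 screens.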

An alternative measure of screening intensity introduced in this study is the screening category score, which shows whether applying specific types of screens matters. The results of the model (6) regression are reported in Table VIII, using both pre- and post-fee FF5 alphas. As we can see, the only category that seems to significantly affect the financial performance of the funds is the social screens category, and only for the post-fee alphas. Its coefficient of -0.731 implies that, with 5 social screens within the category, applying one more results in a 0.20 ∗ (−0.731) = −0.146 (per month, in %) change in the risk-adjusted performance of the fund. Therefore, more social screens translate to lower returns for the funds. Partly matching conclusions were reached by Renneboog, ter Horst and Zhang (2008b), who reported that funds’ returns decrease with screening intensity on social and corporate governance criteria. However, with no other variable having a p-value below 0.05, there seems to be little support from the model for a general negative effect of screening intensity on the funds’ performance. Having said all that, we do not find support for our third hypothesis.
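The arithmetic behind the per-screen effect can be reproduced with a short sketch. It takes the reported social-category coefficient of -0.731 and the 5 screens in that category; the function name is hypothetical.

```python
def effect_of_additional_screens(category_coefficient, screens_in_category, added=1):
    """Change in monthly alpha (in %) when `added` more screens of a category are applied."""
    score_change = added / screens_in_category  # the category score lies between 0 and 1
    return category_coefficient * score_change


one_more_social_screen = effect_of_additional_screens(-0.731, 5)
print(round(one_more_social_screen, 3))
```

Each additional social screen moves the category score by 1/5 = 0.20, so the implied change in the post-fee alpha is one fifth of the category coefficient.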

To identify the potential relationship between the transversal screens (as opposed to industrial) and the financial performance of the funds, we now run model number (7). As can be read from Table IX, for both pre- and post-fee analyses there does not seem to be a significant relationship between these two variables. Using the pre-fee FF5 alphas, the transversal beta coefficient of 0.060 is not significant, meaning that we cannot conclude that it affects the risk-adjusted performance in any way. The industrial coefficient of 0.095, while not significant, is also of the opposite sign to what was hypothesised (it is above zero). The post-fee alphas regression leads to the same conclusions, with the coefficients of transversal and industrial equal to, respectively, 0.099 and 0.124, both insignificant. With these results we cannot reject the null hypothesis of H4 A), which is consistent with H4 A), while we do not have sufficient statistical evidence to support H4 B) and its implied negative effect of industrial screens on the funds’ financial performance. Seemingly, industrial screens are not that harmful to the returns of the SRI funds. This result is not aligned with the relationship between these variables previously reported by Capelle-Blancard and Monjon (2012). Firstly, it might be due to a different treatment of the variable accounting for the transversal or industrial screens. Capelle-Blancard and Monjon (2012) count the total number of transversal versus industrial screens, whereas we created a dummy variable depending on the number of transversal or industrial screens utilised by a given fund. Moreover, as previously mentioned, Capelle-Blancard and Monjon (2012) consider a different sample and a different performance evaluation model, both of which might have affected the results of their analyses.

Lastly, we estimate model number (8) to check for the effects of individual screens on the financial performance of the SRI funds. Due to collinearity issues, we had to exclude two of the screens, namely pollution and environment, from the analysis without the screens and screens squared variables. When these two variables were included in the regression, the two screens that were omitted were labor relations and again pollution. From the second column of Table X, we can see that using the pre-fee alphas there are 5 individual screens significantly affecting the returns of the funds. These are community (at 10% significance), diversity (1% sig.), conflict risk (5% sig.), executive pay (10% sig.), and shareholder engagement (5% sig.). Furthermore, we also observe an effect of age on the performance that is significant at the 10% level, although its coefficient is approximately zero, so the effect is rather negligible. Running the same regression with the post-fee alphas, we get 7 screens with a significant relationship to the funds’ performance, namely community (5% sig.), diversity (1% sig.), conflict risk (5% sig.), executive pay (5% sig.), alcohol (10% sig.), tobacco (5% sig.), and shareholder engagement (10% sig.). Interestingly, all of the significant coefficients are negative.

The negative effects of tobacco and alcohol are understandable. These are industrial screens, which exclude whole industries from the pool of investment opportunities. The coefficient of community can also be explained by the focus on local opportunities, which likely means forgoing other, more lucrative but more dispersed investments. Omitting investments related to conflict risk presents a similar issue: one gives up possibly attractive business opportunities in risky countries. Here, however, one could imagine that there are significant political costs related to investing in these countries, so avoiding them could actually be beneficial.

On the other hand, the coefficient of diversity seems surprising at first. The arguments relating diversity to firms’ performance presented in the existing literature are consistent, supporting the view that diversity leads to higher performance. Allen, Dawson, Wheatley and White (2007) show, using regression analysis, that employees’ perceptions of diversity at both the managerial and the non-managerial level are positively linked to perceptions of organizational performance. Moreover, Miller and del Carmen Triana (2009) found a positive relationship between the racial diversity of the companies’ boards and the firms’ reputation and innovation (mediators leading to higher performance). These arguments suggest that investing in diversity-supporting firms should be beneficial. It is therefore a puzzle why SRI funds that employ diversity screens should suffer in terms of financial returns.

We come upon a similar inconsistency with the shareholder engagement screen. According to Henisz, Dorobantu and Nartey (2013), increased shareholder engagement boosts the financial valuation of the firm. Thus, one could assume that funds screening for companies with higher shareholder engagement would experience higher returns, not lower ones as our findings suggest. Lastly, the executive pay screen should, intuitively, also increase rather than decrease the performance of the firm. Paying the executives a reasonable amount of money, aligned with the goals of shareholders and other stakeholders, should be generally beneficial for the firm. We, however, observe a negative relationship between this screen and the performance of the funds.

As we can see, we do find some counterintuitive and unexpected coefficients in our analysis of the relationship between the funds’ performance and the individual SRI screens they apply. That already results in a rejection of our fifth hypothesis. Perhaps the negative coefficients are caused by the costs of implementing the screens and of the screening process when deciding on what to invest in. Examining this further presents an interesting opportunity for future research.

Bringing together all of our results, the screening intensity does not, in general, affect the performance of the U.S. SRI funds. There is, however, one screening category that affects the returns. Also, looking at the individual screens, 7 out of 17 do have an effect on the financial performance of the funds. Thus, answering the question stated at the beginning of this paper, there is no unified argument saying that screening intensity does or does not affect the performance of the funds. The relationship is to some extent ambiguous and still needs to be studied.

6. Limitations

Unfortunately, the results reported above suffer from several limitations. First of all, in order to utilize the performance measuring models (CAPM, Fama-French 3-factor, Carhart 4-factor and Fama-French 5-factor), we have to assume they measure the returns properly and reliably. If that is not the case, the observed abnormal returns can be a result of a market inefficiency, a misestimation and inaccuracy of the model, or both. This problem is called the joint hypothesis problem and makes testing for market efficiency difficult, if possible at all.

Another potential limitation is the survivorship bias described earlier. It arises when poorly performing funds cease operation or merge with other funds and drop out of the database. We then observe only the well-performing entities. However, as already mentioned, ter Horst, Zhang and Renneboog (2007) report that the number of SRI funds ceasing to exist is relatively small, making it unlikely to bias the results. It should, however, be kept in mind when analysing and generalizing the conclusions made here.

Next, there is the look-ahead bias, which we also could not avoid due to data limitations. We did not have data on, e.g., management fees, size, or screening policies over the years, which forced us to assume these remained constant since the inception date. However, variables such as those mentioned above are not likely to change, as changing them is a difficult and troublesome procedure. That makes our assumption more reasonable, although relaxing it remains a fruitful opportunity for future research, should such data become available.

The assumption of the betas (systematic risk) of the funds remaining constant over the years might also be problematic for this research. As argued by Kon & Jen (1978) this assumption proves unsatisfactory since managers change their positions over time, which changes the funds’ betas. The systematic risk is therefore more of a changing value rather than a constant.

What is also rather problematic is the collinearity of certain screens. We had to eliminate the pollution and environment screens from the analysis due to this issue, and therefore cannot conclude what their potential effect on the performance of the funds is. Collecting more data could help solve this. Perhaps in the future, once the SRI industry is more developed and more data on it becomes available, it will no longer be a problem.

Also, the direction of the relationship between the screens and financial performance cannot really be determined from our analysis. While a firm that acts on environmental issues might have better financial performance thanks to, e.g., a better reputation, it may well be the case that this relationship works in the other direction: well-performing firms have more unused cash to invest in the environment and in becoming sustainable. That presents a simultaneous causality bias. Eliminating it leaves space for future research in the field.

Generally, the limitations presented above make the conclusions reached from the regressions less internally valid. However, the analysis still adds value to the previously written literature on the topic.

7. Conclusion

This paper aims to discuss the socially responsible investments industry and examine the relationship between the screening intensity of the SRI funds and their performance. With this goal in mind, we collected a sample of 52 SRI funds, all U.S.-domiciled and investing in either domestic or international equity. The sample is unbalanced, ranging from April 1987 to December 2019. Out of 52 funds, depending on the model and on whether pre- or post-fee alphas are utilised, 34-38 underperform the market and none outperforms it. This, however, only holds assuming that the models yield rational expectations for the appropriate returns.

The main value of this paper centres around the empirical analysis of the screening intensity of the SRI funds and their performance and decomposing the screening process. At this point, an assumption had to be made, that screens used by the funds remained the same since the inception date, introducing the look-ahead bias to our analysis. Our results are mostly not aligned with the previous literature on the topic.

First of all, the intensity of the screening procedure does not seem to affect the financial performance of the SRI funds. Unlike Lee, Humphrey, Benson and Ahn (2010) and Capelle-Blancard and Monjon (2012), we did not find any significant relationship between the screening intensity and the financial performance of the examined funds. Our analyses lead to the conclusion that the number of screens utilized by the funds leads to neither over- nor under-performance with regard to the market. We also did not find curvilinearity in the relationship, which was reported by Barnett and Salomon (2006).

Next, we divide the screens into categories. We find that the social category is the only one that affects the performance of the funds, and it does so negatively. Here the result matches what was reported by Renneboog, ter Horst and Zhang (2008b), namely that funds’ returns decrease with screening intensity on social and corporate governance criteria.

Furthermore, we run a separate regression to test whether industrial screens have a different effect on the returns of the funds than the transversal screens. Yet again, we do not come across any significant results, implying that neither transversal nor industrial screens affect the performance of the SRI funds.

Lastly, we create dummy variables for each individual screen and examine the potential effects of the individual screens used by the funds on their financial performance. The screens that affect (negatively) the performance of the funds are the community, diversity, conflict risk, executive pay, alcohol, tobacco, and shareholder engagement screens. While the coefficients of some of the screens seem reasonable, the effects of the others are counterintuitive and present an opportunity for future research.

Unfortunately, this research is also subject to several limitations. Firstly, we came upon the joint hypothesis problem, meaning that we have to assume the models for evaluating funds’ performance measure it properly. Next, due to data limitations our sample suffers from survivorship bias and look-ahead bias. Luckily, neither seems to be all that problematic when analysing SRI funds. Another assumption we had to make for the sake of our analyses is that the beta of the funds (systematic risk) remained constant over the years, while managers can alter the risk profile of the funds by changing their positions in certain assets. Also, since the SRI industry is relatively young, the database on the SRI industry is limited, which might be a potential reason for the collinearity issues regarding the individual screens in our analyses. Lastly, there is a possibility that some of the relationships run in an unexpected direction, introducing simultaneous causality bias into our research. All these limitations, while harmful for the internal validity of our conclusions, do not eliminate their value, and present fruitful opportunities for future research on the topic.

Reference list

Allen, R., Dawson, G., Wheatley, K., & White, C. (2007). Perceived diversity and organizational performance. Employee Relations, 30(1), 20-33.

(24)

https://doi.org/10.1108/01425450810835392

Barnett, M. & Salomon, R. (2004). Unpacking Social Responsibility: The Curvilinear Relationship between Social and Financial Performance. SSRN Electronic Journal. Barnett, M., & Salomon, R. (2006). Beyond dichotomy: the curvilinear relationship between

social responsibility and financial performance. Strategic Management Journal, 27(11), 1101-1122. https://doi.org/10.1002/smj.557

Capelle-Blancard, G. & Monjon, S., 2012. The Performance of Socially Responsible Funds: Does the Screening Process Matter? European Financial Management, 20(3), 494-520. Chen, J., Hong, H., Huang, M., & Kubik, J. (2004). Does Fund Size Erode Mutual Fund

Performance? The Role of Liquidity and Organization. American Economic

Review, 94(5), 1276-1302. https://doi.org/10.1257/0002828043052277

Cumby, R. E. & Glen, J. D. (1990). Evaluating the Performance of International Mutual Funds.

The Journal of Finance, 45(2), 497-521.

Droms, W. G. & Walker, D. A. (1994). Investment performance of international mutual funds.

The Journal of Financial Research, 17(1), 1-14.

Fabozzi, F., Ma, K., & Oliphant, B. (2008). Sin Stock Returns. The Journal Of Portfolio

Management, 35(1), 82-94. https://doi.org/10.3905/jpm.2008.35.1.82

Fama, E. & French, K. (2015). A five-factor asset pricing model. Journal of Financial

Economics, 116(1), 1-22.

French, K. R. (2020). Kenneth R. French - Home Page. Mba.tuck.dartmouth.edu. Retrieved 13 June 2020, from http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/index.html. Geczy, C., Stambaugh, R., & Levin, D. (2003). Investing in Socially Responsible Mutual

Funds. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.416380

Gallo, J. G. & Swanson, P. E. (1995). Comparative measures of performance for U.S.-based international equity mutual funds. Journal of Banking & Finance, 20, 1635-1650. Henisz, W., Dorobantu, S., & Nartey, L. (2013). Spinning gold: The financial returns to

stakeholder engagement. Strategic Management Journal, 35(12), 1727-1748. https://doi.org/10.1002/smj.2180

Jensen, M. (1968). The Performance of Mutual Funds in the Period 1945-1964.

The Journal Of Finance, 23(2), 389-416. https://doi.org/10.1111/j.1540-

6261.1968.tb00815.x.

Jones, R., & Wermers, R. (2011). Active Management in Mostly Efficient Markets. Financial

Analysts Journal, 67(6), 29-45. https://doi.org/10.2469/faj.v67.n6.5

Kon, S., & Jen, F. (1978). Estimation of Time-Varying Systematic Risk and Performance for Mutual Fund Portfolios: An Application of Switching Regression. The Journal Of

(25)

Finance, 33(2), 457-475. https://doi.org/10.1111/j.1540-6261.1978.tb04861.x.

Lee, D., Humphrey, J., Benson, K. & Ahn, J. (2010). Socially responsible investment fund performance: the impact of screening intensity. Accounting & Finance, 50(2), 351-370. Levy, H. and Sarnat, M. (1970). International Diversification of Investment Portfolios. The

American Economic Review, 60(4), 668-675.

Miller, T., & del Carmen Triana, M. (2009). Demographic Diversity in the Boardroom: Mediators of the Board Diversity-Firm Performance Relationship. Journal Of

Management Studies, 46(5), 755-786. https://doi.org/10.1111/j.1467-6486.2009.00839.x

Otten, R., Bauer, R. & Koedijk, K. (2002). International Evidence on Ethical Mutual Fund Performance and Investment Style. SSRN Electronic Journal.

USSIF (2020). Mutual Fund Performance Chart | US SIF. Charts.ussif.org. Retrieved 29 April 2020, from https://charts.ussif.org/mfpc/?FundType=IG&.

Peiró, A. (1999). Skewness in financial returns. Journal Of Banking & Finance, 23(6), 847-862. https://doi.org/10.1016/s0378-4266(98)00119-8

Redman, A. L., Gullet, N. S. & Manakyan, H. (2000). The performance of global and international mutual funds. Journal of Financial and Strategic Decisions, 13(1), 75-85.

Renneboog, L., ter Horst, J. & Zhang, C. (2008a). Socially responsible investments: Institutional aspects, performance, and investor behavior. Journal of Banking & Finance, 32(9), 1723-1742.

Renneboog, L., ter Horst, J. & Zhang, C. (2008b). The Price of Ethics and Stakeholder Governance: The Performance of Socially Responsible Mutual Funds. SSRN Electronic Journal, 14, 302-322.

ter Horst, J., Zhang, C., & Renneboog, L. (2007). Socially Responsible Investments: Methodology, Risk Exposure and Performance. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.985267


Appendix

Table II Descriptive Statistics

The dataset ranges from April 1987 to December 2019. The table presents descriptive data for each individual fund, namely the number of observations, mean, standard deviation, minimum and maximum observed values, 1% and 99% percentiles, and skewness and kurtosis measures.

Fund      Obs     Mean  Std.Dev.       Min      Max
APPLX     156      .21     4.161   -14.474   12.877
BCAIX     107     .267     3.929   -11.515   10.455
CFWAX     135     .375     4.919    -20.08   14.057
CGAEX     151     -.22      7.08   -37.003   15.952
COIIX     151     .185     5.194   -24.691   13.051
CVMAX      86     .482     4.084    -9.157   10.008
CWVGX     326     .018     4.652   -25.292   12.662
DOMAX     156    -.021      5.32    -22.97   13.309
ESGMX      45     .848     3.566   -10.386    7.039
GCINX      39     .449     3.153    -8.436    6.242
MASGX      56      .16     3.608    -9.411    6.998
MPLAX     107     .159     3.996   -12.376   10.845
NALFX     364     .022     6.062   -54.814   17.753
PORTX     224      .34     4.424   -17.792   11.247
PXEAX      79     .544     3.851   -10.353    9.201
PXINX     141     .055     5.014   -21.545   14.488
SEEFX      57     .561     3.137    -8.395    6.481
SRIFX      36     .497     3.494   -10.042    5.702
BCAMX      92      .66     3.385   -14.177    7.948
BVSIX      47     .662     4.447   -18.395    9.831
CFJAX      54     .426     3.831   -10.877    9.311
CGJAX      54     .906     3.872    -9.074    8.625
CSIEX     381     .209     4.422   -21.257   12.455
CSXAX     226      .35     4.482   -18.758   11.881
DSEPX     175    -.267     6.034    -34.03   12.683
GCEQX     267     .346     4.418    -15.56   10.453
MGNDX     151     .677     4.304   -17.451   10.549
MVIAX     221     .205     4.609   -18.491   12.361
NRAAX     126     .524     4.512   -17.991    9.921
PAXDX      36     .656     3.572   -11.003    6.651
PAXLX      36     .205     6.121   -27.735    8.322
PRBLX     320     .185     3.669   -15.487   12.103
PXGAX      79     .469     3.792   -14.994    8.212
REDWX      49     .563     5.295   -20.052   12.134
WSEFX     246      .32     3.885     -16.1    10.75
ARGFX     393     .228     5.566    -26.77   31.303
CAAPX     351     .274     5.216   -25.087   24.469
CAMFX      61      .35     3.541    -9.996    6.868
CCAFX     290     .166     5.563   -20.341     17.5
CCVAX     182     .334      5.18   -19.734   16.499
MMSCX     151     .026     5.957   -20.352   13.391
PARMX     176     .475     4.123   -18.302   13.372
PXSAX      79     .239     4.128   -21.992     7.97
TSMDX      52     .456     4.768   -16.482   10.145
WASMX      90     .657         4   -13.592    8.132
WASOX     134     .583     5.232   -20.391   16.094
CAAAX     173      .18     4.241   -18.904   10.963
CCLAX     176     .004     1.824     -9.24    4.174
CMAAX     176     .108     3.149    -14.51    7.942
MPLIX     108     .166     4.001   -12.335    10.93
PARWX     176     .551     4.959   -17.788   20.631
PXWEX     146     .125     4.708   -20.092   11.345
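
The statistics named in the caption of Table II can be computed per fund along the following lines. This is a minimal sketch using only the Python standard library, not the procedure used in the thesis: the function name `describe`, the example return series, the moment-based definitions of skewness and kurtosis, and the "inclusive" percentile interpolation are all assumptions made for illustration.

```python
import statistics


def describe(returns):
    """Summary statistics for one fund's return series (a list of numbers)."""
    n = len(returns)
    mean = statistics.fmean(returns)
    # Central moments with divisor n; skewness and kurtosis are built from these.
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    # 99 cut points: index 0 is the 1st percentile, index 98 the 99th.
    cuts = statistics.quantiles(returns, n=100, method="inclusive")
    return {
        "obs": n,
        "mean": mean,
        "std_dev": statistics.stdev(returns),  # sample standard deviation (divisor n - 1)
        "min": min(returns),
        "max": max(returns),
        "p1": cuts[0],
        "p99": cuts[98],
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,  # raw kurtosis; a normal distribution gives 3
    }


# Hypothetical returns in percent, not taken from the thesis data.
example_returns = [1.2, -0.8, 3.4, -5.1, 0.6, 2.2, -1.9, 0.3]
for name, value in describe(example_returns).items():
    print(f"{name:>9}: {value:.3f}")
```

A constant return series has zero variance, so the skewness and kurtosis ratios are undefined in that case and the sketch would raise a division error.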

Table III White’s test for heteroscedasticity

The table presents the outcomes of White’s test for heteroscedasticity. The sample consists of 52 selected U.S.-domiciled SRI funds. The dataset ranges from April 1987 to December 2019. The test was applied to the Fama-French 5-factor alpha regression of each fund.

Fund        Chi2    P-value
APPLX      10.41    0.9601
BCAIX      24.09    0.2384
CFWAX       9.42    0.9774
CGAEX      29.39    0.0804*
COIIX       6.84    0.9972
CVMAX      17.03    0.6512
CWVGX       7.35    0.9954
DOMAX      12.45    0.8995
ESGMX      15.39    0.7535
GCINX      21.27    0.3812
MASGX      12.66    0.8914
MPLAX      13.41    0.8591
NALFX      79.98    0.0000
PORTX       7.92    0.9924
PXEAX      13.40    0.8596
PXINX      26.16    0.1605
SEEFX      26.29    0.1565
SRIFX      29.50    0.0784*
BCAMX      31.63    0.0474**
BVSIX      30.87    0.0570*
CFJAX      12.17    0.9099
CGJAX      12.22    0.9081
CSIEX      10.72    0.9533
CSXAX      38.30    0.0081***
DSEPX       7.84    0.9929
GCEQX      17.65    0.6108
MGNDX      14.00    0.8306
MVIAX      25.95    0.1675
NRAAX      14.89    0.7828
PAXDX      17.26    0.6362
PAXLX      24.96    0.2028
PRBLX      35.39    0.0181**
PXGAX      19.57    0.4853
REDWX      19.40    0.4957
WSEFX      12.71    0.8894
ARGFX      54.16    0.0001***
CAAPX      66.09    0.0000***
CAMFX       6.41    0.9982
CCAFX      23.33    0.2731
CCVAX      10.45    0.9593
MMSCX       9.91    0.9699
PARMX      25.80    0.1724
PXSAX      27.93    0.1111
TSMDX      16.09    0.7113
WASMX      14.84    0.7857
WASOX       9.99    0.9684
CAAAX       9.51    0.9763
CCLAX      15.64    0.7388
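
For readers who want to reproduce a test of this kind, the sketch below shows the textbook LM form of White's test applied to a five-factor regression. It is an illustration under stated assumptions, not the code behind Table III: the thesis does not say which software was used, the simulated factor and return series are hypothetical, and the helper names `white_test` and `chi2_sf_even` are invented here. With five regressors the auxiliary regression has 5 levels, 5 squares and 10 cross products, which gives 20 degrees of freedom; the chi-square values and p-values reported in Table III appear consistent with that figure.

```python
import math
from itertools import combinations_with_replacement

import numpy as np


def chi2_sf_even(x, df):
    """Upper-tail chi-square probability; this closed form holds for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term = total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def white_test(resid, factors):
    """LM statistic n * R^2 from regressing squared residuals on the regressors,
    their squares and their pairwise cross products (plus a constant)."""
    n, k = factors.shape
    cols = [np.ones(n)] + [factors[:, j] for j in range(k)]
    cols += [factors[:, i] * factors[:, j]
             for i, j in combinations_with_replacement(range(k), 2)]
    Z = np.column_stack(cols)
    u2 = resid ** 2
    coef, *_ = np.linalg.lstsq(Z, u2, rcond=None)
    ss_res = np.sum((u2 - Z @ coef) ** 2)
    ss_tot = np.sum((u2 - u2.mean()) ** 2)
    return n * (1.0 - ss_res / ss_tot), Z.shape[1] - 1


# Simulated stand-ins for one fund's excess returns and the five factor series.
rng = np.random.default_rng(0)
n_obs = 156
factors = rng.normal(size=(n_obs, 5))
excess = 0.2 + factors @ np.array([1.0, 0.2, 0.1, 0.0, -0.1]) + rng.normal(scale=2.0, size=n_obs)

# First stage: OLS of excess returns on a constant (alpha) and the five factors.
X = np.column_stack([np.ones(n_obs), factors])
beta, *_ = np.linalg.lstsq(X, excess, rcond=None)
resid = excess - X @ beta

stat, df = white_test(resid, factors)
p_value = chi2_sf_even(stat, df)
print(f"chi2 = {stat:.2f}, df = {df}, p = {p_value:.4f}")
```

The p-value helper avoids a SciPy dependency by using the closed-form chi-square tail for even degrees of freedom; for a regression with a different number of factors the degrees of freedom may be odd, in which case a general chi-square routine would be needed instead.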
