Sometimes you need to look at change to measure effects: An international study on the relation between changes in ESG scores and excess portfolio returns.

* I am grateful to Prof. dr. L.J.R. Scholtens for the guidance, helpful comments, suggestions and fast replies that I received during my thesis process. I am also grateful to the RUG and its employees for the fast transition to online education in these tough Corona times.

Master Thesis

MSc Finance

Faculty of Economics and Business, Department of Finance*

Abstract

This thesis studies the relation between changes in ESG scores and excess stock returns. Using a large international sample of 3,646 stocks across North America, Europe and Asia Pacific covering the period 2003-2019, I find that it is possible to create excess returns with a portfolio of firms that experienced a percentage increase in their ESG score. This can be achieved for the combined regions, North America and Europe, but not for Asia Pacific. Excess returns can be accomplished with a buy & hold strategy and with a long-short strategy. Only the buy & hold strategy yields alphas that are statistically significant, although they are economically insignificant. Portfolios of firms with increased ESG ratings that have a low or high absolute ESG rating do not yield excess returns.

Sander Koster s.r.koster.1@student.rug.nl S3838056 04-06-2020 Supervisor: Prof. dr. L.J.R. Scholtens

Table of Contents

1. Introduction

2. Literature review and hypotheses

2.1 Empirical findings on the relation between CSR and stock returns.

2.2 ESG scores

2.3 Asset pricing models

2.3.1 Capital Asset Pricing model

2.3.2 Fama and French Three-Factor Model

2.3.3 Carhart Four-Factor Model

2.3.4 Fama and French Five-Factor Model

2.4 Empirical findings

2.5 Hypotheses

3. Model, data and method

3.1 Model

3.2 Data, portfolio creation and descriptive statistics

1. Introduction

Environmental, social and governance (ESG) factors are becoming increasingly popular in the investment world. The websites of major asset managers all offer the option to incorporate ESG in portfolios. According to the US SIF Foundation (2018), socially responsible investing (SRI) in the US has increased from around 2,000 billion dollars in 2003 to almost 12,000 billion dollars in 2018. Ghosh (2020) notes that sustainable assets have doubled since 2012 across developed countries. In this paper, I investigate whether socially responsible investing can yield excess returns. I do so by examining whether excess returns can be generated by investing in firms that have experienced a positive change in their ESG score1. ESG scores can increase when companies make sustainable investments. Therefore, I can capture the relationship between sustainable investments and stock returns by looking at the relationship between ESG score increases and excess returns. The main question that this thesis revolves around is the following:

Does a portfolio which consists of firms that experienced a change in ESG scores create excess returns? 2

Corporate social responsibility (CSR) can affect stock prices in several ways. CSR can have a positive effect on the stock price. Porter and Kramer (2011) state that CSR can enhance a company's image. According to Porter (1991), CSR can directly decrease costs, such as the costs of lawsuits and production costs. CSR also strengthens the long-term focus of a business. On the other hand, Fernando, Sharfman and Uysal (2017) state that CSR can be expensive. If its benefits do not outweigh its costs, CSR negatively influences the stock price. There are only a few empirical studies on the relationship between stock prices and changes in ESG scores. Sahut and Pasquini-Descomps (2015) create a risk factor that captures the change in ESG scores. They find a slightly negative relationship between the change in ESG scores and excess returns in the UK. They do not find significant coefficients in the US and Switzerland. The relationship between absolute ESG scores and stock prices is covered more extensively. Kempf and Osthoff (2007) find that an investor can earn excess returns of up to 8.7% per year in the US. Investors can accomplish this by going long in high ESG rated firms and short in low ESG rated firms. This finding is complemented by the study of Dimson, Karakas and Li (2015), who find evidence of positive abnormal stock returns after CSR engagements in the US. Fernando, Sharfman and Uysal (2017) find contradicting effects. Their US-based research concludes that high ESG rated (green) firms and controversial firms both underperform neutral companies. This implies that investors pay a price for investing in either of the two ends. Becchetti, Ciciretti and Dalò (2016) find a negative relation between stock returns and CSR levels in developed markets. These papers investigate the same underlying relation, but they differ in their methodology and sample. Across the studies there are differences in regions, asset pricing models, regression models and ESG measures. All these factors could lead to different outcomes.

1 This approach was inspired by a guest lecturer from Aegon Asset Management. He briefly mentioned that they might have found excess returns with portfolios that incorporated firms with a change in their ESG score.

2 The long portfolio is based on firms that experienced the largest percentage increases in the sample. The 33%

This research tries to overcome the differences between these studies to draw a robust conclusion. The focus is on a large international sample across developed markets that covers the period 2003-2019. It uses 3,646 unique rated firms across North America, Europe and Asia Pacific. The sample is split up in several ways to draw robust conclusions. The environmental, social and governance pillars are investigated separately. This allows me to check whether score increases have different effects across the pillars. Differences between ESG pillars are found by Sahut and Pasquini-Descomps (2015), Kempf and Osthoff (2007), and Hübel and Scholz (2019). I also split the sample into high scoring and low scoring firms. This way I can investigate whether a change in ESG scores has a different effect for high or low scoring firms. This is necessary because a sustainable investment can, for example, lead to different marginal cost reductions for low scoring firms than for high scoring firms. I add industry dummies to the regression model to account for industry fixed effects. This study estimates excess returns with the Capital Asset Pricing Model and the Fama and French Five-Factor Model.

In the following chapter, I review relevant literature. I touch upon the ESG scores and their issues, studies which explain the relation between changes in ESG scores and excess returns, actual empirical findings on the relation between ESG and excess returns, and asset pricing models. The hypotheses that I test in this study follow after the literature review. In chapter three, I discuss and motivate the asset pricing models that are used in this study. Next, I explain which data inputs are necessary for these models, how I obtain them, and their descriptive statistics. After this explanation, I show how I create the portfolios which are used as the dependent variables. The last section in chapter three explains the estimation method that I use and contains tests of the assumptions that this method requires. The results of the estimation of the asset pricing models for the created portfolios are given in chapter four. This chapter also contains the tests of the hypotheses. Chapter five contains the conclusion of this research.

2. Literature review and hypotheses

Empirical research is inconclusive about the relationship between ESG scores and stock returns. Some papers find negative effects, some find no effects, and others find positive ones. However, the papers seem to share the same underlying empirical findings that predict the relationship between ESG and returns.

2.1 Empirical findings on the relation between CSR and stock returns.

the cost of capital. If this is the case, the value of the company will increase. An important note on risk reduction is that it needs to be value enhancing. If the actions that reduce the risk do not simultaneously increase firm value, then investors will require a lower risk premium for the stock. This will cause the stock price to drop. Thus, if CSR reduces risk, it can reduce the stock price of the firm.

This last condition brings me to other findings that explain why CSR and ESG scores might yield negative stock returns. These findings build on the work of Friedman (1962). He states that businesses solely exist to maximize shareholder welfare. ESG related investments often maximize stakeholder welfare. This implies that ESG investments do not maximize firm value. Fernando, Sharfman and Uysal (2017) complement this argument by finding that socially responsible investments are often value destroying if they go beyond legal requirements. Bauer, Koedijk and Otten (2005) state that investors face diversification issues when they include ESG dimensions in their portfolio. Therefore, an investor might give up excess returns by including ESG dimensions in a portfolio.

2.2 ESG scores

Before I move to the empirical findings, it is important to define and discuss ESG scores. The following explanations are based on Thomson Reuters ESG scores, the database that is used in this study3. ESG scores are widely used by investors to grade firms and to make investment decisions based on socially responsible beliefs. ESG stands for the three pillars that are used to measure the “greenness” of a firm: E stands for environmental, S for social and G for governance. The environmental pillar broadly grades companies on resource use, emissions and innovation. The social pillar grades on aspects such as the workforce, human rights, community and product responsibility. The governance pillar grades firms on their management, reporting, shareholder policy and CSR strategy. Firms are rated within industries and not across industries. This is an important note, because it implies that companies such as Shell can have high environmental scores, which might be counterintuitive given the nature of their business. ESG scores provide a fast and easy way to assess the “greenness” of a firm, but they have their issues. The first problem with ESG scores is that there is no clear definition of ESG and how it should be measured. Semenova and Hassel (2015) find that the scores of the three largest databases, which include Thomson Reuters and KLD, are highly correlated. This is positive, because it implies that they capture the scores correctly. However, in their research they find a few issues with the environmental score. They conclude that the company specific scores are driven by industry related risks. For this reason, they do not fully capture company specific risk. The scores may therefore be biased per industry and not truly reflect firm specific ESG levels. Berg, Koelbel and Rigobon (2019) investigate all three ESG pillars among the five largest rating databases. They find that the correlations between the datasets lie between 0.42 and 0.73. This is low compared to the correlation between the credit ratings of S&P and Moody’s, which is 0.99. This implies that relations between ESG and stock returns may vary a lot depending on which agency is used. The differences between scores arise for 53% from measurement differences, for 44% from differences in scope and for 3% from differences in weights. This further indicates that ESG scores might not fully capture actual ESG levels. One important final note on ESG scores is given by Statman (2016). He states that ESG scores do not reflect risk well, because some factors greatly influence the scores but not risk, and vice versa. Even though ESG scores are not a perfect measure of actual ESG levels, they are still widely used in empirical research. This is mainly due to the lack of consistent definitions and of a better alternative.

3 The following link leads to the document from the Thomson Reuters database that explains the scoring
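The point that firms are rated within industries can be illustrated with a minimal sketch. The percentile-rank rule, the function name and the firm data below are assumptions made purely for illustration; they are not taken from the Thomson Reuters methodology document.

```python
from collections import defaultdict


def within_industry_percentile(raw):
    """Rank each firm against peers in its own industry only.

    raw: list of (firm, industry, raw_value) tuples (hypothetical data).
    Returns a dict mapping firm -> score between 0 and 100.
    """
    by_industry = defaultdict(list)
    for firm, industry, value in raw:
        by_industry[industry].append((firm, value))

    scores = {}
    for peers in by_industry.values():
        n = len(peers)
        for firm, value in peers:
            worse = sum(1 for _, v in peers if v < value)
            same = sum(1 for _, v in peers if v == value)  # includes the firm itself
            scores[firm] = 100.0 * (worse + same / 2) / n
    return scores


# Hypothetical raw environmental values: the oil firms score low in absolute terms.
sample = [
    ("OilA", "Energy", 30), ("OilB", "Energy", 10),
    ("TechC", "IT", 80), ("TechD", "IT", 60),
]
print(within_industry_percentile(sample))
```

In this hypothetical sample the best energy firm receives a higher score than the weaker technology firm, even though its raw value is lower, which is the counterintuitive effect described above.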

2.3 Asset pricing models

The empirical papers that investigate the relationship between ESG and stock returns usually use asset pricing models. For this reason, I first discuss the four most widely used asset pricing models: the CAPM, the two Fama and French models and the Carhart model. After this, I discuss the empirical studies on the relation between ESG and stock returns. All these models build on modern portfolio theory. Markowitz (1952) finds a relation between expected return and risk. Investors are only compensated for systematic risk, because they can diversify away unsystematic risk. This diversification principle leads to the construction of the efficient frontier. With proper diversification, investors can maximize their return for a given level of risk.

2.3.1 Capital Asset Pricing model

market portfolio increases by one percentage point, the excess return of the dependent variable increases by beta percentage points, holding everything else constant (ceteris paribus). If the beta is above 1, the portfolio moves in the same direction as the market portfolio, but it moves more strongly and is riskier than the market portfolio. If beta is positive and below 1, the dependent variable moves in the same direction as the market portfolio, but it moves less strongly and is less risky. A negative beta is uncommon. It implies that the dependent variable moves in the opposite direction to the market portfolio and therefore moves against the economy. Examples are put options and gold.
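The interpretation of beta can be made concrete with a short sketch that estimates alpha and beta by ordinary least squares on excess returns. The function name and the return series are hypothetical and only serve as an illustration; this is not the estimation code used in this study.

```python
def capm_alpha_beta(portfolio_excess, market_excess):
    """Single-regressor OLS of portfolio excess returns on market excess returns.

    Returns (alpha, beta), where beta = cov(market, portfolio) / var(market).
    """
    n = len(market_excess)
    mean_x = sum(market_excess) / n
    mean_y = sum(portfolio_excess) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(market_excess, portfolio_excess))
    var = sum((x - mean_x) ** 2 for x in market_excess)
    beta = cov / var
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Hypothetical yearly excess returns: the portfolio amplifies the market by 1.5
# and earns one percentage point on top of that.
market = [0.02, -0.01, 0.03, 0.00, 0.05]
portfolio = [0.01 + 1.5 * m for m in market]
alpha, beta = capm_alpha_beta(portfolio, market)
print(round(alpha, 4), round(beta, 4))
```

A beta of 1.5 in this constructed example means that the portfolio moves in the same direction as the market but more strongly, while the positive alpha is the part of the excess return that the market factor does not explain.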

2.3.2 Fama and French Three-Factor Model

2.3.3 Carhart Four-Factor Model

Carhart (1997) takes the Fama and French Three-Factor Model and adds a momentum factor. This factor measures the excess return of well performing firms over poorly performing firms. It controls for the short-term historical outperformance of upward trending stocks over downward trending stocks. The coefficient can be interpreted as follows: if the excess return of well performing stocks over poorly performing stocks increases by one percentage point, the excess return of the dependent variable increases by beta percentage points, ceteris paribus. A positive significant beta implies that the dependent variable is tilted towards upward trending stocks. Thus, the excess returns are partially explained by the outperformance of upward trending stocks. A negative significant beta implies that the dependent variable is tilted towards downward trending stocks: their effects outweigh the effects of the upward trending stocks in the portfolio. Thus, this underperformance partially explains the excess return of the dependent variable. The momentum factor is much debated in practice, but it generally increases the explanatory power of the model.
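For reference, the four-factor specification described above can be written in the same notation as the models in chapter three. The label MOM_t for the momentum factor is a notational assumption made here; Carhart's own label for this factor is PR1YR.

```latex
\begin{equation*}
R_{it} - R_{Ft} = \alpha_i + \beta_{1i}(R_{Mt} - R_{Ft}) + \beta_{2i} SMB_t
  + \beta_{3i} HML_t + \beta_{4i} MOM_t + \varepsilon_{it}
\end{equation*}
```

Here $\beta_{4i}$ is the momentum coefficient whose sign and significance are interpreted in the paragraph above.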

2.3.4 Fama and French Five-Factor Model

conservatively investing firms are significantly present in the portfolio. Thus, their outperformance outweighs the underperformance of aggressively investing firms. This effect partially explains the excess return of the dependent variable. A negative beta implies that the portfolio is tilted towards aggressively investing firms. Their underperformance partially explains the excess return of the dependent variable. Although this model is subject to criticism, the F&F 5-Factor Model has a higher explanatory power than the F&F 3-Factor Model. Fama and French chose not to include a momentum factor, because their model fails to predict a momentum premium. The model is yet to fully prove itself. However, as the past has shown, criticism is normal for a new model and it takes years before it is widely accepted.

2.4 Empirical findings

Like Hübel and Scholz, Becchetti, Ciciretti and Dalò also create risk measures. They use the six VIGEO CSR pillars to create CSR risk measures. They study the developed markets of Europe, North America and Asia Pacific. They use 600 stocks per region from the STOXX Global 1800 index, cover the time frame from 2005 to 2014 and use the Carhart Four-Factor Model. The investigated regions, the asset pricing model and the ESG database make this research different from that of Hübel and Scholz. In addition, the results for this time frame might be biased by the financial crisis, for example through the effects of the flight to quality. Becchetti, Ciciretti and Dalò find that CSR levels are negatively related to stock returns. This complements the negative relation between environmental risk and stock returns that Hübel and Scholz find. It implies that increases in CSR lead to lower stock returns. If I assume that increases in ESG scores are the result of increased CSR levels, then I can expect a negative relation between ESG score increases and excess stock returns.

Fernando, Sharfman and Uysal (2017) do not create a risk measure. Instead, they create various portfolios and look at alphas to find excess returns. Their research focusses on US firms in the time frame 1997-2007. They gather ESG data from the KLD data set. They create three portfolios: a toxic portfolio, a neutral portfolio and a green portfolio. The toxic portfolio consists of firms with high controversy scores. The green portfolio consists of stocks with a high ESG rating. The neutral portfolio consists of the stocks in between the other two portfolios. They find that neither end outperforms neutral firms and thus that there is no risk premium for toxic or green firms. This contradicts the finding of Hübel and Scholz, who find that there is a risk premium for toxic firms over green firms in the environmental pillar. This could be caused by differences in the sample, because the papers use different regions, models, time frames and ESG databases. It could also imply that there is a risk premium for toxic firms over green firms, but not for toxic firms over neutral firms. Fernando, Sharfman and Uysal find that industry fixed effects are present and should be controlled for. This seems logical, because a high ESG rating can be more valuable for an industrial firm than for a technology firm, for example because it leads to a higher cost reduction.

factor based on changes in ESG scores, instead of absolute scores. They find a slightly negative relation between ESG score increases and excess returns in the UK. No significant coefficients are found in the US or Swiss subsamples. The study uses 618 Swiss, 1,335 UK and 8,039 US monthly observations. Splitting the observations over the seven dimensions leaves a small sample size for Switzerland and the UK. Also, this study mainly covers the financial crisis, which can bias the results. Therefore, I must be careful when interpreting the results. Nonetheless, this research suggests that regional differences are present and should be accounted for in future studies. The results also indicate that I can expect a negative relation between increases in ESG scores and stock returns.

The previously discussed papers find either no significant relation or a negative relation between ESG and stock returns. Kempf and Osthoff (2007) find a positive relationship between ESG and stock returns. They focus on US firms in the time frame 1992-2004. They create portfolios of the firms with the highest 10% ESG scores and the lowest 10% ESG scores. The ESG data is gathered from the KLD dataset. By going long in the high ESG scoring firms and short in the low ESG scoring firms, they can achieve excess returns of up to 8.7% per year. Excess returns can be achieved with a best in class strategy4 and positive screening5. They find this with the use of the Carhart 4-Factor Model. These results imply that the relation between low ESG scores and stock returns is very different from the relation between high ESG scores and stock returns. The difference between the findings of this study and the results of the previously discussed papers is large. The difference likely lies in the fact that this paper defines the low ESG scoring firms as actual low ESG scoring firms, whereas previous papers, like Fernando, Sharfman and Uysal, compare high ESG scoring firms to controversial firms. In addition, the investigated region, the rating database and the time frame differ, which might partially explain the different results.

2.5 Hypotheses

I want to answer the following main research question: “does a portfolio which consists of firms that experienced a change in ESG scores create excess returns?”. The results of the discussed empirical papers do not give a conclusive expectation for this question. Kempf and Osthoff (2007) and Fernando, Sharfman and Uysal (2017) do not, unlike the other studies, create risk measures. This is also not the goal of my research. However, they measure the relationship between ESG scores and stock prices, not the relation between changes in ESG scores and stock prices. The only paper that investigates the latter relation is written by Sahut and Pasquini-Descomps (2015). I choose to base my expectations for the main research question on these three papers, because they are the most similar to this study. Kempf and Osthoff find a positive relation, Fernando, Sharfman and Uysal find no relation, and Sahut and Pasquini-Descomps find no relation or a slightly negative one in the UK. The slightly negative relation in the UK is based on a relatively small sample and is found by using more and different ESG dimensions than Kempf and Osthoff, Fernando, Sharfman and Uysal, and I use. Therefore, I expect a positive relation between an increase in firms' ESG scores and excess stock returns. I will test this with the following hypothesis:

4 The best in class strategy in this sense applies to the inclusion of the highest ESG rated firms in a portfolio, per sector. Thus, the portfolio holds the highest ESG rated firms, equally weighted across all sectors.

5 Positive screening applies to the inclusion of firms with a high ESG rating or a particularly good score in for

A portfolio which consists of firms that experienced an increased6 ESG score yields excess returns (hypothesis 1)
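The selection behind hypothesis 1, where increases refer to the 33% of firms with the largest increase in their ESG score, can be sketched as follows. The function name, the example scores and the choice to interpret 33% as the top third (rounded down) are assumptions made for illustration, not the exact procedure of this study.

```python
def select_increase_portfolio(prev_scores, curr_scores, groups=3):
    """Pick the top 1/groups of firms by percentage change in ESG score.

    prev_scores, curr_scores: dicts mapping firm -> ESG score (hypothetical).
    Returns (selected_firms, pct_changes).
    """
    changes = {}
    for firm, prev in prev_scores.items():
        curr = curr_scores.get(firm)
        if curr is None or prev <= 0:
            continue  # skip firms without two valid consecutive scores
        changes[firm] = (curr - prev) / prev

    ranked = sorted(changes, key=changes.get, reverse=True)
    n_select = len(ranked) // groups  # top third, rounded down
    return ranked[:n_select], changes


# Hypothetical scores for six firms in two consecutive years.
prev = {"A": 50, "B": 40, "C": 60, "D": 80, "E": 20, "F": 70}
curr = {"A": 60, "B": 40, "C": 57, "D": 84, "E": 30, "F": 70}
selected, changes = select_increase_portfolio(prev, curr)
print(selected)
```

Ranking on the percentage change rather than on the absolute score means that a low scoring firm with a large relative improvement can enter the portfolio ahead of a high scoring firm whose score barely moves.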

In this study, I aim to draw a robust conclusion on the main research question. To achieve this, I try to overcome the differences between the discussed empirical papers that might influence the answer to the main research question. The first thing that I investigate next to the main research question is the relationship between the separate ESG pillars and excess returns. Hübel and Scholz (2019) find different relations between the separate ESG pillars and stock returns. This seems logical. Take for example an industrial firm. A large investment to reduce CO2 emissions can have a very different impact on this firm than increased reporting on its CO2 emissions. I will investigate the three pillars separately by testing the following hypotheses:

A portfolio which consists of firms that experienced an increased environmental score yields excess returns (hypothesis 2)7

A portfolio which consists of firms that experienced an increased social score yields excess returns (hypothesis 3)

A portfolio which consists of firms that experienced an increased governance score yields excess returns (hypothesis 4)

Furthermore, I will supply additional evidence on whether the excess returns of firms that experienced an increase in their ESG score are significantly different across the three individual ESG pillars. This will be tested with the following hypothesis:

6 Increases in hypothesis 1 refer to the 33% of firms that experienced the largest increase in their ESG score.

The excess returns of #pillar-based8 portfolios are significantly different from the excess returns of #pillar-based portfolios (hypothesis 5)

The second aspect that I will investigate is the possible difference between low ESG scoring firms and high ESG scoring firms. Kempf and Osthoff (2007) find that high scoring firms outperform low scoring firms. Next to this, Fernando, Sharfman and Uysal, but also Hübel and Scholz, find that toxic, neutral and green firms have a different relation to stock returns. This makes sense in light of Porter (1991), who states that CSR can create value when it directly reduces costs, for example in the production process or from lawsuits. On the other hand, CSR can be value destroying when it exceeds regulatory requirements. A low ESG scoring firm faces more ESG risk than a high ESG scoring firm in the same industry. An increased ESG score can therefore lead to higher marginal cost reductions for a low scoring firm than for a high scoring firm. In this research I refer to firms with a low absolute rating as “toxic” and to firms with a high absolute rating as “green”. I will investigate whether the relation between ESG score increases and excess returns differs between toxic and green firms by testing the following hypotheses:

A portfolio which consists of toxic firms that experienced an increased ESG score yields excess returns (hypothesis 6)9

A portfolio which consists of green firms that experienced an increased ESG score yields excess returns (hypothesis 7)

Furthermore, I will test if the excess returns of toxic and green firms are significantly different from each other. This will supply additional evidence on whether the excess returns differ between toxic firms that experienced an increased ESG score and green firms that experienced an increased ESG score. I will test this with the following hypothesis:

The excess returns of toxic firms are significantly different from the excess returns of green firms (hypothesis 8)

The empirical papers focus on developed markets. However, only Becchetti, Ciciretti and Dalò incorporate all the developed markets into their research. Regional differences could have an impact on the results, because of cultural differences or different motivations for SRI. North America is known to have a philanthropic motivation for SRI, whereas in Europe SRI is demanded by the community. Cultural differences could therefore be reflected in investment decisions, and for the same reason they can bias the relation between ESG scores and stock returns. Next to this, Kitzmueller and Shimshack (2012) state that differences in regulation can also bias ESG levels. A country with high regulatory pressure on ESG aspects could have far higher base ESG levels. I will investigate whether the relation between excess returns and ESG score increases differs across regions by testing the following hypotheses:

8 #pillar stands for the environmental, social and governance pillars respectively. Thus, this hypothesis is tested for all three ESG pillars. #pillar is used to improve the flow of the paper.

A portfolio which consists of North American firms that experienced an increased ESG score yields excess returns (hypothesis 9)10

A portfolio which consists of European firms that experienced an increased ESG score yields excess returns (hypothesis 10)

A portfolio which consists of Asia Pacific firms that experienced an increased ESG score yields excess returns (hypothesis 11)

Furthermore, I will test if the excess returns of the regional portfolios are significantly different from each other. This will supply additional evidence on whether the excess returns differ across the developed regions. I will test this with the following hypothesis:

The excess returns of #region-based11 portfolios are significantly different from the excess returns of #region-based portfolios (hypothesis 12)

While only Fernando, Sharfman and Uysal (2017) test for industry fixed effects and find that they are present, all the papers control for them. They generally do so by equally weighting the portfolios across industries. Industry fixed effects can influence the relation between excess returns and increases in ESG scores. First, Semenova and Hassel (2015) find that ESG scores sometimes reflect industry risk levels more than firm specific risks. Next to this, it seems likely that a CO2 reduction in an industrial firm has different effects than a CO2 reduction in an information technology firm. I will control for industry fixed effects by incorporating sector dummies into the regression.

3. Model, data and method

3.1 Model

In the literature section, I showed that asset pricing models like the Fama and French models are widely used to price stocks and portfolios. In this research, I use the Fama and French Five-Factor Model and the CAPM as the asset pricing models. Fama and French (2017) find that the F&F 5-Factor Model has the highest explanatory power of the Fama and French models. A momentum factor is not included, because Fama and French (2015) state that there is a lack of evidence that a momentum factor is a determinant of the expected return. The CAPM is used to estimate the results for the combined regions, because Fama and French factors are not available for these regions combined. As mentioned in the literature and hypotheses section, I control for industry fixed effects by including sector dummies in the model. Table A1 in the appendix shows that industry fixed effects are present. The 11 GICS sectors are used in this study. Table A2 in the appendix shows that the largest sector in the sample is the industrials sector. I exclude the dummy for this largest sector from the model to prevent the dummy variable trap. The coefficients of the included dummies are therefore measured relative to the industrials sector. The models are as follows:

10 Increases are defined by the increases which are stated in hypotheses 1, 2, 3, 4, 6 and 7.

11 #region-based stands for the North American, European and Asian Pacific region respectively. Thus, this

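CAPM (model 1)12

$$R_{it} - R_{Ft} = \alpha_i + \beta_{1i}(R_{Mt} - R_{Ft}) + \beta_{2i}CMS_t + \beta_{3i}CSD_t + \beta_{4i}CSS_t + \beta_{5i}ENG_t + \beta_{6i}FIN_t + \beta_{7i}HTC_t + \beta_{8i}IFT_t + \beta_{9i}MAT_t + \beta_{10i}RES_t + \beta_{11i}UTT_t + \varepsilon_{it} \tag{1}$$

F&F 5-factor model (model 2)13

$$R_{it} - R_{Ft} = \alpha_i + \beta_{1i}(R_{Mt} - R_{Ft}) + \beta_{2i}SMB_t + \beta_{3i}HML_t + \beta_{4i}RMW_t + \beta_{5i}CMA_t + \beta_{7i}CMS_t + \beta_{8i}CSD_t + \beta_{9i}CSS_t + \beta_{10i}ENG_t + \beta_{11i}FIN_t + \beta_{12i}HTC_t + \beta_{13i}IFT_t + \beta_{14i}MAT_t + \beta_{15i}RES_t + \beta_{16i}UTT_t + \varepsilon_{it} \tag{2}$$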
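The sector dummies in models 1 and 2 can be sketched as a simple encoding in which the industrials sector is the omitted baseline. The function name and the mapping below are hypothetical helpers for illustration; only the sector abbreviations are taken from the model definitions.

```python
# The 11 GICS sectors with the abbreviations used in models 1 and 2.
# Industrials has no abbreviation because it is the omitted baseline.
SECTOR_CODES = {
    "Communication Services": "CMS",
    "Consumer Discretionary": "CSD",
    "Consumer Staples": "CSS",
    "Energy": "ENG",
    "Financials": "FIN",
    "Health Care": "HTC",
    "Industrials": None,
    "Information Technology": "IFT",
    "Materials": "MAT",
    "Real Estate": "RES",
    "Utilities": "UTT",
}


def sector_dummies(sector):
    """Return the ten 0/1 dummies for one GICS sector name.

    An Industrials observation gets all zeros, so the dummies never sum to one
    for every observation and are not collinear with the constant.
    """
    if sector not in SECTOR_CODES:
        raise ValueError(f"unknown GICS sector: {sector}")
    return {
        code: int(name == sector)
        for name, code in SECTOR_CODES.items()
        if code is not None
    }


print(sector_dummies("Energy"))
print(sector_dummies("Industrials"))
```

With this encoding each dummy coefficient is read as the difference relative to the industrials sector, which is the reason for leaving that sector out.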

3.2 Data, portfolio creation and descriptive statistics

Each of these variables require data inputs. The data is gathered from two databases. The Fama and French factors are downloaded from Kenneth R. French his database14. I take the yearly factors for developed markets North America, Europe and Asia Pacific. I choose to use yearly

12_{Rit-RFt = the excess return of the portfolio, ai = the constant/alpha, Rmt – RFt = the excess return of the}

market portfolio, CMS = the dummy for the communication services sector, CSD = the dummy for the Consumer Discretionary sector, CSS = the dummy for the Consumer Staples sector, ENG = the dummy for the Energy sector, FIN = the dummy for the Financials sector, HTC = the dummy for the Health Care sector, IFT = the dummy for the Information Technology sector, MAT = the dummy for the Materials sector, RES = the dummy for the Real Estate sector, UTT = the dummy for the Utilities sector, 𝝴it = the error term

13 Rit − RFt = the excess return of the portfolio, αi = the constant/alpha, RMt − RFt = the excess return of the market portfolio, SMBt = the excess return of small-cap stocks over large-cap stocks, HMLt = the excess return of high book-to-market ratio firms over low book-to-market ratio firms, RMWt = the excess return of robust operating profitability firms over weak operating profitability firms, CMAt = the excess return of conservatively investing firms over aggressively investing firms, CMS = the dummy for the Communication Services sector, CSD = the dummy for the Consumer Discretionary sector, CSS = the dummy for the Consumer Staples sector, ENG = the dummy for the Energy sector, FIN = the dummy for the Financials sector, HTC = the dummy for the Health Care sector, IFT = the dummy for the Information Technology sector, MAT = the dummy for the Materials sector, RES = the dummy for the Real Estate sector, UTT = the dummy for the Utilities sector, εit = the error term.


factors, because my statistics software is unable to handle the dataset with monthly observations. Fama and French use the US one-month Treasury bill rate as the risk-free rate, and this study follows that approach. The US one-month T-bill is generally considered the asset closest to being risk-free. The German bund would be a good alternative for Europe. However, the German bund yield has frequently been negative in the last three years. This is not representative of a true risk-free rate, because it does not represent the long-term growth rate of the economy. This issue could be tackled by taking the 30-year German bund. However, this biases the early years, because this rate was very high in the early 2000s, which is also not representative of the long-term growth rate of the economy. The last motivation for the use of the US T-bill is that there is no truly risk-free, and thus comparable, treasury bill in Asia Pacific. I therefore choose the US one-month Treasury bill as the risk-free rate for all regions. I do so for the sake of consistency and because it breaks the non-negativity constraint less often than the other options. It is not a perfect risk-free measure, but it is the most applicable one in this context.

I gather stock data, firm-specific data and ESG scores from the Thomson Reuters Eikon database. Stock data is used to create the portfolios (dependent variables). I select stocks from the primary exchanges of the countries in each region. Primary exchanges can be selected in the Thomson Reuters Eikon database. They contain the stocks of national firms or firms that have their headquarters in the country of the primary index. This makes it easier to find unique15 firms across countries. It also prevents a company like Royal Dutch Shell, which is listed on multiple indices across countries, from being included in the sample multiple times. Table A3 in the appendix shows which countries and exchanges are used. If a primary exchange did not contain a representative number of stocks16, I switched to a non-primary exchange. I made sure that only unique firms are left in the sample. The stock data is in the form of the adjusted close price, which accounts for stock splits and similar events that would otherwise influence the stock price without altering the market value. For all the firms for which I have gathered stock data, I also gather company information. For each firm, I take the company code, company common name, ISIN code, GICS sector name, GICS sector code, country of headquarters and the currency. The GICS sector information is used to create the sector dummies. Finally, I retrieve the environmental pillar score, social pillar score, governance pillar score and the overall ESG score for each of these firms. The environmental pillar broadly grades firms based on resource use,

15 A unique firm is present only once in the sample. It is therefore only present on one index in the sample.

16 A sufficient number of stocks is at least 50 stocks for smaller European countries and New Zealand,


emissions and innovation. The social pillar grades firms on aspects such as the workforce, human rights, community and product responsibility. The governance pillar grades firms based on their management, reporting, shareholder policy and CSR strategy. The overall ESG score combines the scores of the three pillars and weights them equally. Table A4 in the appendix lists the commands that are used to retrieve the data from the Thomson Reuters Eikon database. I use the ESG scores to create the portfolios. All the data is retrieved for the period 2003-2019. The portfolios range from 2004 until 2019: 2003 serves as the base year and drops out, because I calculate yearly percentage changes. This timeframe makes full use of the ESG scores provided by Thomson Reuters, whose earliest scores date from 2003. The Thomson Reuters Eikon database provides an increasing number of rated companies each year.

In this study, I follow Kempf & Osthoff (2007) in applying the buy & hold strategy and the long-short strategy to measure portfolio performance. The buy & hold strategy17 is a passive investment strategy where one buys stocks and holds them for a long period. It is an easy strategy to implement and allows one to measure the long-term performance of a stock or portfolio. I will use it to investigate whether firms that experienced an increase in their ESG score outperform the market. The long-short strategy18 is a strategy where one goes long in a stock that is perceived to be under-priced and short in a stock that one believes is currently overpriced. According to the efficient market hypothesis by Fama (1970), all stocks should converge to the market equilibrium. This means that overpriced stocks will lose value and under-priced stocks will gain value. In this research, I use the long-short strategy to investigate whether it is possible to generate excess returns by going long in firms that experienced an increase in their ESG score and short in firms that experienced a decrease in their ESG score. This allows me to investigate whether the stock prices of firms with increased ESG scores rise while the stock prices of firms with decreased ESG scores fall. If I assume that CSR leads to changes in ESG scores, then this might imply that CSR both negatively and positively affects stock prices. Before going into the portfolio creation, it is important to note that the portfolios must hold at least 20 stocks per year. Upson, Jessup and Matsumoto (1975) find that portfolio managers should diversify among more than 16 stocks to produce a well-diversified portfolio.

17 The definition and explanation of the buy & hold strategy can be found on: https://www.investopedia.com/terms/b/buyandhold.asp

18 The definition and explanation of the long-short strategy can be found on:


I will include a minimum of 20 stocks in each yearly portfolio to account for diversification. I create three kinds of portfolios for which I measure the excess returns:

1. E-S-G pillar portfolios, Buy & Hold strategy

The first kind of portfolios are the environmental, social, governance and total ESG score portfolios. These are created for the combined regions and for the separate regions in the following way. First, I calculate the log stock returns and the percentage changes in the E, S and G pillar scores. Secondly, I take the 33% of firms with the highest positive percentage change in their pillar scores. A 33% cut-off is common practice when creating a long or short portfolio in academic research. Such cut-offs are used by, for example, Fama and French (1993), Fama and French (2015), Becchetti, Ciciretti and Dalò (2017) and Hübel and Scholz (2019). I repeat this process for each year in the sample, so the portfolios are rebalanced yearly and held for one year. These portfolios are used to answer hypotheses one to five.
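The selection steps for one year can be sketched as follows. This is a simplified illustration with made-up data, not the procedure as actually implemented for this thesis. The function names are hypothetical, and the sketch assumes that the cut-off is one third of all rated firms in a given year, of which only firms with a positive change are kept.

```python
import math

MIN_STOCKS = 20  # minimum number of stocks per yearly portfolio, as stated in the text


def pct_change(previous, current):
    """Yearly percentage change of a pillar score, as a fraction (0.10 = 10%)."""
    return (current - previous) / previous


def log_return(previous_price, current_price):
    """Yearly log return computed from two adjusted close prices."""
    return math.log(current_price / previous_price)


def top_tercile(changes):
    """Firms in the top third by score change, keeping positive changes only.

    `changes` maps a firm identifier to its percentage change for one year.
    Assumption: the cut-off is one third of all rated firms in that year.
    """
    ranked = sorted(changes, key=changes.get, reverse=True)
    cutoff = len(ranked) // 3
    return [firm for firm in ranked[:cutoff] if changes[firm] > 0]


def meets_minimum(portfolio, minimum=MIN_STOCKS):
    """Check the diversification constraint of at least 20 stocks."""
    return len(portfolio) >= minimum


# Made-up data: 90 firms with score changes ranging from -30% to +59%.
changes_2004 = {f"f{i}": (i - 30) / 100 for i in range(90)}
long_portfolio = top_tercile(changes_2004)
print(len(long_portfolio), meets_minimum(long_portfolio))  # prints: 30 True
```

Repeating the selection with each year's changes gives the yearly rebalanced portfolios. A year in which `meets_minimum` returns `False` corresponds to a year in which the 20-stock constraint is broken.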

2. Toxic and Green portfolios, Buy & Hold strategy


observations due to a lack of ESG rated firms. Furthermore, far fewer rated companies are available in the early years for Asia Pacific than in more recent years. I break the 20-stock constraint for Asia Pacific more often than in the other regions. This is even the case in the years 2008 until 2019 in Asia Pacific. For this reason, I was unfortunately unable to create toxic and green portfolios for Asia Pacific.

3. Short portfolios, long-short Strategy

I compose the short portfolios for the long-short strategy. The goal of the long-short strategy is to measure whether I can create excess returns by going long in firms with positive percentage changes in ESG scores and short in firms with negative percentage changes in ESG scores. The short portfolios are composed as opposites of the long portfolios that are used in the buy & hold strategy. Where the long portfolios hold the 33% of firms with the highest positive ESG percentage changes, the short portfolios hold the 33% of firms with the largest negative ESG percentage changes. The toxic and green short portfolios are created by taking all firms with a decrease of more than 5 percentage points in their ESG score from the 33% lowest absolute scoring firms (toxic portfolio) or the 33% highest absolute scoring firms (green portfolio).
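The construction of the long and short legs can be sketched as follows. This is a minimal illustration with made-up numbers, not the code used in this thesis. The function names are hypothetical, and the sketch assumes equal weighting within each leg and defines the long-short return as the return of the long leg minus the return of the short leg.

```python
def tercile(changes, side):
    """Return the top ('long') or bottom ('short') third of firms by score change."""
    if side not in ("long", "short"):
        raise ValueError("side must be 'long' or 'short'")
    ranked = sorted(changes, key=changes.get, reverse=(side == "long"))
    cutoff = len(ranked) // 3
    if side == "long":
        return [firm for firm in ranked[:cutoff] if changes[firm] > 0]
    return [firm for firm in ranked[:cutoff] if changes[firm] < 0]


def mean_return(returns, firms):
    """Equal-weighted average return of the given firms (an assumption)."""
    if not firms:
        raise ValueError("portfolio leg is empty")
    return sum(returns[firm] for firm in firms) / len(firms)


def long_short_return(returns, changes):
    """Return of the long leg minus the return of the short leg."""
    long_leg = tercile(changes, "long")
    short_leg = tercile(changes, "short")
    return mean_return(returns, long_leg) - mean_return(returns, short_leg)


# Made-up yearly ESG score changes and stock returns for six firms.
changes = {"A": 0.20, "B": 0.10, "C": 0.02, "D": -0.01, "E": -0.08, "F": -0.15}
returns = {"A": 0.12, "B": 0.08, "C": 0.05, "D": 0.03, "E": -0.02, "F": -0.06}
spread = long_short_return(returns, changes)
print(round(spread, 4))
```

In this example the long leg holds firms A and B and the short leg holds firms F and E, so the strategy earns a positive return when firms with increased scores outperform firms with decreased scores.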


Table 3 shows the descriptive statistics of the factors in the Fama and French Five-Factor Model. I observe slight variations between the variables. The MktRf variable seems to have the highest volatility and shows its highest values in Europe. The mean value of SMB is in general higher in North America, whereas the mean value of HML is higher in Europe.


across countries. This can be found in table A7 in the appendix. The long-short portfolios are highly correlated with other long-short portfolios, as can be seen in tables A8 and A9 in the appendix. This does not hold for the combined region portfolio. When I compare the long-short portfolios with the long portfolios in tables A10 and A11 in the appendix, I generally observe negative correlations between long-short portfolios and long portfolios in North America and Europe. This implies that the long-short strategy might yield different returns than the buy & hold strategy. The long-short portfolios and the long portfolios are generally not correlated in the combined regions and Asia Pacific, which might likewise imply that the long-short strategy yields a different result than the buy & hold strategy in these regions.

3.3 Estimation method


4. Results

Tables 4, 5 and 6 show the estimation results of the excess returns for the different portfolios. The estimation results in table 4 relate to hypothesis one. Table 5 holds the estimation results used to test hypotheses two to five. The estimation results in table 6 are used to test hypotheses six, seven and eight. Using these three tables combined with additional information in the appendix, I test the regional difference hypotheses nine to twelve.


firms over weak operating profitability firms. From the statistically significant and negative beta of CMA, I can conclude that the ENV, SOC and GOV portfolios in North America consist of firms that are shifted towards being aggressively investing firms. For the European portfolios, I find statistically significant betas for SMB in all three portfolios and for HML in the social and governance portfolios. These statistically significant coefficients of above 1 imply that the excess returns are partially explained by the size effect. They also mean that the ENV, SOC and GOV portfolios consist of firms that are shifted towards being small-cap stocks. The excess returns of the European social and governance portfolios are also partially explained by the underperformance of growth firms relative to value firms. This implies that the firms in the SOC and GOV portfolios are shifted towards being growth firms. The Fama and French factors do not have statistically significant betas for the social and governance portfolios in Asia Pacific. The SMB, HML and RMW factors do have statistically significant betas for the environmental portfolio in this region. This implies that the excess returns of this portfolio are partially explained by the size effect, the outperformance of value firms and the outperformance of robust operating profitability firms. The alphas are statistically significant for all three portfolios in the combined regions and in North America. The alphas are also statistically significant for the environmental and social portfolios in Europe. They are insignificant for the governance portfolio in Europe and for all three portfolios in Asia Pacific. The high and significant F&F factor betas and the significant alphas imply that it is possible to create excess returns with a buy & hold strategy using portfolios based on the 33% of firms that experienced the highest percentual increase in their environmental, social or governance score. This is possible for all three portfolios in the combined regions and North America, and for the environmental and social portfolios in Europe. The alphas are economically insignificant.


America and Europe, for the social low and high absolute rated portfolios in the combined regions and North America, and finally for the governance low and high absolute rated portfolios in the combined regions and North America. This allows me to accept hypothesis 8, which states that the high and low absolute rated portfolios are different from each other. I accept this hypothesis for North America, for three out of four portfolios for the combined regions and for the environmental pillar in Europe.


5. Conclusion

This research focusses on the question of whether it is possible to create excess returns with a portfolio that consists of stocks that experienced an increase in their ESG scores. The underlying idea of this research question is the relation between corporate social responsibility levels and excess returns. Porter (1991) states that CSR can enhance firm value by reducing production costs and by preventing costs from future lawsuits and penalties. Next to cost reductions, CSR can also strengthen a firm's long-term focus, which reduces risk. According to Freeman (1984), a company is able to enhance its image by taking all stakeholders into account. ESG scores should increase with CSR.

Using the Capital Asset Pricing Model and the Fama and French Five-Factor Model, I estimate the excess returns of portfolios that consist of firms with an increased ESG score. I make use of the buy & hold strategy and the long-short strategy as trading strategies. The buy & hold strategy measures the long-term performance of the portfolio. The long-short strategy measures excess returns by going long in firms with increased ESG scores and short in firms with decreased ESG scores. This provides additional evidence on the relation between ESG score changes and excess returns. The long portfolios (buy & hold portfolios) consist of the 33% of firms that experienced the largest percentual increase in their ESG score. The short portfolios consist of the 33% of firms that experienced the largest percentual decrease in their ESG score. The portfolios are rebalanced on an annual basis.

The estimation results of the CAPM and the F&F 5-Factor Model for these portfolios are given in table 4. I can conclude that it is possible to generate excess returns from a portfolio which consists of firms that experienced an increase in their ESG rating. This can be achieved with the buy & hold strategy and the long-short strategy, in the combined regions, North America and Europe. The statistically significant betas for MktRF in both


over large-cap stocks, high book-to-market firms over low book-to-market firms and robust operating profitability firms over weak operating profitability firms. The portfolio excess returns are subject to the underperformance of aggressively investing firms relative to conservatively investing firms. Thus, the firms in this portfolio are shifted towards being aggressively investing firms. This explanation applies to both trading strategies. In Europe, I observe that the excess returns in the buy & hold strategy are driven by the size effect. The firms in this portfolio are shifted towards being small-cap stocks. The excess returns are subject to the underperformance of growth firms. Thus, the firms are also shifted towards being growth firms. The underperformance of growth firms is offset by the short portfolio. This can be observed from the insignificant HML beta in the long-short strategy. The insignificant beta implies that the short portfolio is subject to the underperformance of growth firms, but profits from this underperformance by going short in these firms. The excess return of the European long-short portfolio is further explained by the size effect (SMB) and the outperformance of robust operating profitability firms over weak operating profitability firms (RMW). These findings are in line with the findings of Kempf and Osthoff (2007), who find that it is possible to achieve excess returns by going long in high ESG rated portfolios and short in low ESG rated portfolios. They contradict the finding of Sahut and Pasquini-Descomps (2015), who find a negative relation between increases in ESG scores and excess returns.

The possibility to create excess returns from firms that experienced an increase in their ESG rating can be used to incorporate ESG aspects into a portfolio. On average, such firms are shifted towards being small-cap firms, value firms and robust operating profitability firms, which leads to outperformance. Thus, by including these firms in a portfolio, an investor is able to support firms that achieved an increased ESG rating without giving up potential excess returns. This is possible for the combined regions, North America and Europe.


returns. For both the buy & hold strategy and the long-short strategy, no excess returns are found. Tables 6 and 7 show that these portfolios yield poor excess returns. This is in line with the finding of Fernando, Sharfman and Uysal (2017), who find that there is no premium for toxic and green firms compared to their neutral counterparts.

The economic explanation for the excess returns is not reflected by the alphas of the portfolios. Instead, the excess returns are explained by the historical outperformance of small-cap companies, high book-to-market firms and robust operating profitability firms. It is important to note that the estimated relation between actual ESG levels/changes and excess returns could be biased, because ESG scores might not capture actual ESG levels, according to Semenova and Hassel (2015). Future studies could investigate the relation between real ESG investments and excess returns to overcome the bias that arises from ESG scores. Next to this, a future study could extend this research by including time fixed effects. Hübel and Scholz (2019) find that time fixed effects are present for the social pillar. According to them, this can be explained by the flight to quality that arises in recession periods. One final remark for future studies is that one should create a benchmark portfolio from the sample. This makes it easier to compare the estimation results of the asset pricing models and to draw conclusions about the relative economic power of the Fama and French factors.


6. Bibliography

Bauer, R., Koedijk, K., Otten, R., 2005. International evidence on ethical mutual fund performance and investment style. Journal of Banking and Finance 29, 1751–1767.

Becchetti, L., Ciciretti, R., Dalò, A., 2017. Fishing the Corporate Social Responsibility Risk Factors. CEIS Tor Vergata 14(3).

Berg, F., Koelbel, J.F., Rigobon, R., 2019. Aggregate Confusion: The Divergence of ESG Ratings. Working paper. MIT Sloan, Cambridge.

Fama, E., 1970. Efficient capital markets: a review of theory and empirical work. Journal of Finance 25, 383-417.

Fama, E.F., French, K.R., 1993. Common risk factors in the returns on stocks and bonds. Journal of Financial Economics 33, 3-56.

Fama, E.F., French, K.R., 2015. A five-factor asset pricing model. Journal of Financial Economics 116, 1-22.

Fama, E.F., French, K.R., 2017. International tests of a five-factor asset pricing model. Journal of Financial Economics 123(3), 441–463.

Fernando, C.S., Sharfman, M.P., Uysal, V.B., 2017. Corporate Environmental Policy and Shareholder Value: Following the Smart Money. Journal of Financial and Quantitative Analysis 52, 2023-2051.

Freeman, R.E., 1984. Strategic management: A stakeholder approach. Cambridge University Press, Cambridge.

Friedman, M., 1970. The Social Responsibility of Business is to Increase its Profits. New York Times Magazine, 173-178.

Hübel, B., Scholz, H., 2019. Integrating sustainability risk in asset management: The role of ESG exposures and ESG ratings. Working paper, University of Nürnberg.

Kempf, A., Osthoff, P., 2007. The Effect of Socially Responsible Investing on Portfolio Performance. European Financial Management 13(5), 908–922.


Kurtz, L., 2005. Answers to Four Questions. Journal of Investing 14(3), 125–139.

Markowitz, H., 1952. Portfolio Selection. The Journal of Finance 7(1), 77-91.

Porter, M.E., 1991. America’s green strategy. Scientific American 264(4), 168.

Porter, M.E., Kramer, M.R., 2011. The big idea: creating shared value. Harvard Business Review January-February, 4-17.

Sahut, J.M., Pasquini-Descomps, H., 2015. ESG Impact on Market Performance of Firms: International Evidence. Management international 19(2).

Semenova, N., Hassel, L., 2015. On the Validity of Environmental Performance Metrics. Journal of Business Ethics 132(2), 249-258.

Sharpe, W.F., 1964. Capital asset prices: A theory of market equilibrium under conditions of risk. Journal of Finance 19, 425-442.

Statman, M., 2016. Classifying and Measuring the Performance of Socially Responsible Mutual Funds. The Journal of Portfolio Management 42(2), 140-151.


7. Appendices
