
UNIVERSITEIT VAN AMSTERDAM

Herd is the Word

by

Barry Coleman

A thesis submitted in partial fulfillment for the

degree of Master of Actuarial Science and Mathematical Finance

in the

Faculty of Economics and Business Universiteit van Amsterdam


This document is written by Student Barry Coleman who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


"The reaction of one man can be forecast by no known mathematics; the reaction of a billion is something else again"


Abstract

Faculty of Economics and Business Universiteit van Amsterdam

Master of Actuarial Science and Mathematical Finance

by Barry Coleman

In this paper, we examine portfolio optimisation techniques based on the herd behaviour index (HIX) using historical data. The HIX measures the market perception concerning the degree of dependency that exists between a set of random variables representing different stock prices. In the model, rational herding arises because of information-event uncertainty. The construction of this measure is based on the theory of comonotonicity. We investigate multi-period portfolio selection problems constrained by a "long-only" requirement and a target return for a basket of m risky securities that are traded continuously. First, we review risk measures and consider the portfolio selection problem of a decision maker who invests money at predetermined points in time in order to obtain a target capital at the end of the time period under consideration. By minimising risk measures through the optimal allocation of wealth, we compare the market performance of each portfolio. Using a sample of ten stocks from the S&P 500, we find that from 2005 to 2015 the minimised HIX portfolio out-performs minimised variance portfolios and minimised Value-at-Risk portfolios. After the 2008 financial crisis, we observe a high level of herd behaviour until 2012, which is when the minimised HIX portfolio performs best.


Acknowledgements

I want to thank my supervisor from the University of Amsterdam, Daniël Linders, for guidance, patience, reviewing my thesis, answering my questions and providing valuable input.

I would like to take the opportunity to thank my friends for their support, endless joy, and making Amsterdam a special place to me. With Karen Lo, Ngoni Fungura and Stanisław Laniewski, to name a few, time seemed to fly by at the UvA.

And last but not least I would like to thank my parents, Polly and John, and my sisters for their unconditional love, support and faith in me. I credit all the good fortune in my life to them.


Contents

Declaration of Authorship
Abstract
Acknowledgements

1 Introduction
  1.1 Motivation
  1.2 Research Question

2 Tail Transforms, Convex Order and Comonotonicity
  2.1 Tail Transforms
  2.2 Convex order and comonotonicity
  2.3 Multivariate Dependence Measures
    2.3.1 Definitions and Properties

3 The Stock Market
  3.1 The stock market index
  3.2 Behavioral Finance
    3.2.1 Prospect Theory
    3.2.2 Prospect Theory in Stock Markets
    3.2.3 Systemic Risk

4 Portfolio Theory
  4.1 Modern Portfolio Theory
  4.2 Optimal Portfolio Design: Basic Markowitz Mean-Variance Framework
    4.2.1 Independent Assets
    4.2.2 Dependent Assets
  4.3 The Efficient Frontier
    4.3.1 A Simple Example
    4.3.2 Risk-free rate and leverage
  4.4 Mean-Variance Optimization for the long only portfolio problem
    4.4.1 Risk Minimization for N Risky Assets
  4.5 Drawback of Modern Portfolio Theory: Dimensionality and parameter estimation

5 Value at Risk
  5.1 Measuring VaR for a portfolio
  5.2 Minimizing VaR for N Risky Assets

6 Herd Behavior Index
  6.1 Creating a comonotonic sample
  6.2 Finding the Market HIX
  6.3 HIX Minimization for N Risky Assets

7 Results
  7.1 The long only portfolio problem
  7.2 Price Indices for long only portfolios
  7.3 Index performance and Market HIX

8 Conclusion
  8.1 Summary
  8.2 Discussion

A Appendix
  A.1 A useful covariance matrix decomposition
  A.2 Logarithmic returns versus simple returns
  A.3 Coherent Risk Measures

Chapter 1

Introduction

1.1 Motivation

Behavioural science is an area of ever-growing interest in financial markets. Despite access to endless amounts of information, advanced models and algorithms, humans are still prone to making irrational decisions. The field of behavioral finance is said to have developed in response to a host of anomalies that cannot be explained by traditional financial models. We will focus on what propels investors to behave irrationally in uncertain ventures - the reassurance from seeing so many others do the same thing. Due to the nature and intensity of financial markets, participants are subject to the "law of mental unity of crowds", meaning that clustering structures will always be present in the market. This behavior is known as herd mentality and describes how people are influenced by their peers to adopt certain behaviors. Humans are prone to herd mentality, but if one can recognize the herd and examine it rationally, one will be less likely to follow the stampede when it's headed in a dangerous direction [1].

By and large, herd mentality is a phenomenon where individual members of a group subvert their will to the added safety and protection of that group. In the animal kingdom, the herd is personified through biological groupings [2]. Participants in financial markets can be classed as a homogeneous crowd, meaning that they share common goals and pressures. Much as herds of water buffalo share the same migration patterns and dietary requirements in the animal kingdom, in the financial world investors "migrate" their investments to profit-maximizing strategies and "feed" on risk-minimization techniques. Many examples of herding can be seen throughout history: speculative booms arose where investors jumped onto rising commodity prices, ignoring the economic fundamentals. Examples include the price of tulips in the 17th century, the Florida real estate craze of 1926, the dot-com bubble in the late 1990s, or, more recently, the move away from traditional energy investments after oil prices dramatically fell. Participants in these markets all interact and imitate each other, leading to large fluctuations in aggregate demand [3].

The theoretical work on herd behavior started with the seminal papers of Banerjee (1992) [4] and Bikhchandani et al. (1992) [5]. These papers model herd behavior in an abstract environment in which participants with private information make their decisions in sequence. One would expect normally distributed (Gaussian) results if all decisions were made simultaneously. However, in the sequential decision model, individuals make their decisions one at a time and take into account the decisions of the individuals preceding them. It is found that after a finite number of participants have chosen the same action, the following participants disregard their own private information and imitate their predecessors. Herding by market participants is of growing concern for policy makers because it has a destabilizing effect on the market which potentially can lead to a crisis. This leads us to the quote from John F. Kennedy:

“The Chinese use two brush strokes to write the word crisis. One brush stroke stands for danger; the other for opportunity. In a crisis, be aware of the danger but recognize the opportunity.”

This leads us to the question: is it possible to take advantage of the opportunity created by herd behaviour?

Portfolio management is the art and science of making decisions about investment mix and policy, matching investments to objectives, asset allocation for individuals and institutions, and balancing risk against performance. Portfolio management is all about determining strengths, weaknesses, opportunities and threats in the attempt to maximize return at a given appetite for risk.

For a wide range of applications in finance and engineering, it is desirable to make decisions which minimize risk. Risk takes many different forms and there are many different ways to quantify it. Choosing the best method for individual goals is the responsibility of the portfolio manager. The only certainty in investing is that it is impossible to consistently predict the winners and losers, so the prudent approach is to create a basket of investments that provide broad exposure within an asset class. Portfolio diversification is used to manage the risk of a portfolio of assets. However, it is found that periods of high market stress coincide with high levels of co-movement between stocks, implying that the diversification benefits are less effective when needed most. Having information about today's level of co-movement may give market participants the opportunity to take necessary cautionary actions. Markowitz portfolio theory is used to find optimal solutions to complex portfolio allocation problems using the basic mean-variance framework. We will examine the mean-variance framework with the goal of controlling risk for a given level of return and compare it to other well known risk measures.

1.2 Research Question

A group of individuals wishes to select a set of stocks to invest in. In the 2008 financial crisis, the group incurred large losses in their portfolio due to positive correlations between their assets. They recognized the existence of benefits from diversification, but did not understand that co-movement between stock values increases during periods of high market stress and uncertainty. A solution needs to be found that minimizes the co-movement between stocks. As a consequence, the group agrees to construct a portfolio that uses diversification with the objective of achieving a specific level of return for a minimized level of co-movement. The focus of this thesis is to compare the HIX as a measure of risk to other well known measures of risk in the context of market performance.

Linders et al. (2015) [6] review various measures of herd behavior using:

• Historical stock prices

• Implied behavior using option prices

• Implied correlation indices

The disadvantage of these measures is that they are not model-free: they are built on the assumption that the dynamics of stock prices are described by a multivariate lognormal distribution. The measures are also based on pairwise correlations, which fail to fully capture the degree of herd behavior.

In this paper we will extend the research of Linders et al. and use the concept of comonotonicity to define the herd behaviour index (HIX), an alternative measure for the implied degree of comovement between stocks in a portfolio setting. We will minimize the comovement of stocks by selecting optimal weights, then test the HIX minimization technique in a historical setting.

Roughly stated, random variables (r.v.'s) representing the different stock prices one month from now are said to be comonotonic if they move in perfect unison and behave like a single asset, not allowing for any diversification. In reality, stock prices will typically not move in a perfectly comonotonic way and, as such, a stock market does not exhibit perfect herd behavior. Nevertheless, it is possible to construct the hypothetical comonotonic market situation out of the available stock price data. This artificial comonotonic market situation is then used as a point of reference, allowing us to measure the "distance" between the real (observed) market situation and the (non-observed) comonotonic extreme case [6].

A sample of ten stocks from the S&P 500 is selected based on the criterion of having a stock price history dating back to 2005. The stock prices and herd behaviour will be studied from the beginning of 2005 to 2015. The efficient frontier, using Markowitz portfolio theory, will be examined using these stocks and optimal weights are found. However, the solution for these weights is not always feasible. It is necessary to numerically find the optimal weights under extra constraints.
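The kind of constrained search described above can be sketched numerically. In the sketch below, the mean vector, covariance matrix and target return are hypothetical placeholders (not the thesis data); it only illustrates a long-only, target-return risk minimisation in the Markowitz setting:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical estimates for three stocks; placeholders only.
mu = np.array([0.08, 0.10, 0.12])                 # expected returns
cov = np.array([[0.04, 0.01, 0.00],               # covariance matrix
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])
target = 0.10                                     # required portfolio return

# Minimise portfolio variance w' cov w subject to:
#   sum(w) = 1 (fully invested), w' mu = target, 0 <= w_i <= 1 (long only).
res = minimize(
    fun=lambda w: w @ cov @ w,
    x0=np.full(3, 1 / 3),
    bounds=[(0.0, 1.0)] * 3,
    constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0},
                 {"type": "eq", "fun": lambda w: w @ mu - target}],
)
w = res.x   # long-only minimum-variance weights hitting the target return
```

The same scaffold accepts any risk measure as the objective, so swapping the variance for an empirical VaR or HIX estimate changes only the `fun` argument.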

This paper is organized as follows: in Chapter 2, we present convex order and its connection with upper and lower tail transforms, as well as the notion of comonotonicity. Chapter 3 reviews the stock market and behavioural finance. Chapter 4 outlines Markowitz portfolio theory, mean-variance optimization and how to make efficient choices. Chapter 5 describes Value-at-Risk and its minimization technique. In Chapter 6, we explain how to calculate the herd behaviour index and, also, how to minimize the HIX using optimal weights. The results of the minimization techniques and the performance of each portfolio are given in Chapter 7, and finally conclusion and discussion are given in Chapter 8.


Chapter 2

Tail Transforms, Convex Order and Comonotonicity

In this chapter, we present several concepts and results from Linders et al. (2015) [6] that will be used throughout the paper.

2.1 Tail Transforms

The inverse of the cumulative distribution function (cdf) of a random variable (r.v.) $X$ is denoted by $F_X^{-1}$. It is defined by

$$F_X^{-1}(p) := \inf\{x \in \mathbb{R} \mid F_X(x) \geq p\}, \quad p \in [0, 1], \tag{2.1}$$

where ":=" means "is defined by" and with $\inf \emptyset = +\infty$, by convention. An alternative inverse of $F_X$ is defined as follows:

$$F_X^{-1+}(p) := \sup\{x \in \mathbb{R} \mid F_X(x) \leq p\}, \quad p \in [0, 1], \tag{2.2}$$

where $\sup \emptyset = -\infty$, by convention. The inverses $F_X^{-1}(p)$ and $F_X^{-1+}(p)$ only differ on horizontal segments of the distribution function $F_X$. Note that

$$F_X^{-1}(p) = -F_{-X}^{-1+}(1 - p). \tag{2.3}$$

For any number $\alpha \in [0, 1]$, the $\alpha$-inverse $F_X^{-1(\alpha)}$ is defined as a linear combination of the inverses of Equations 2.1 and 2.2:

$$F_X^{-1(\alpha)}(p) := \alpha F_X^{-1}(p) + (1 - \alpha) F_X^{-1+}(p), \quad p \in (0, 1). \tag{2.4}$$
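As a concrete illustration, the two generalized inverses can be computed for an empirical cdf. This is a minimal sketch (the helper names are ours, not from the thesis); it shows that $F_X^{-1}$ and $F_X^{-1+}$ disagree exactly on a horizontal segment of $F_X$:

```python
import numpy as np

def inv(sample, p):
    """F^{-1}(p) = inf{x : F(x) >= p} for the empirical cdf of `sample`."""
    xs = np.sort(np.asarray(sample, dtype=float))
    n = len(xs)
    k = int(np.ceil(p * n))              # smallest k with k/n >= p
    return xs[min(max(k, 1), n) - 1]

def inv_plus(sample, p):
    """F^{-1+}(p) = sup{x : F(x) <= p} for the empirical cdf of `sample`."""
    xs = np.sort(np.asarray(sample, dtype=float))
    n = len(xs)
    k = int(np.floor(p * n))             # largest k with k/n <= p
    return xs[min(k, n - 1)]

x = [3.0, 1.0, 2.0, 2.0]                 # empirical cdf: F(1)=0.25, F(2)=0.75, F(3)=1
print(inv(x, 0.25), inv_plus(x, 0.25))   # 1.0 2.0 -- they differ on the flat segment
print(inv(x, 0.5), inv_plus(x, 0.5))     # 2.0 2.0 -- equal off the flat segments
```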


All r.v.'s that will be considered in this paper are assumed to have a finite mean. The Tail Value-at-Risk at level $p$, notation $\mathrm{TVaR}_p[X]$, of a r.v. $X$ is defined as follows:

$$\mathrm{TVaR}_p[X] := \frac{1}{1-p} \int_p^1 F_X^{-1}(q)\,dq, \quad p \in (0, 1). \tag{2.5}$$

$\mathrm{TVaR}_p$ can be considered as a measure for the upper tail of the cdf $F_X$ of $X$. It is linked with the upper tail transform $\mathbb{E}[(X - K)_+]$ for an appropriate choice of $K$:

$$\mathrm{TVaR}_p[X] = F_X^{-1(\alpha)}(p) + \frac{1}{1-p}\,\mathbb{E}\!\left[\left(X - F_X^{-1(\alpha)}(p)\right)_+\right], \quad \alpha \in (0, 1) \text{ and } p \in (0, 1), \tag{2.6}$$

see e.g. Dhaene et al. (2006) [7].

The Left Tail Value-at-Risk at level $p$ of $X$ is denoted by $\mathrm{LTVaR}_p[X]$. It is defined by

$$\mathrm{LTVaR}_p[X] := \frac{1}{p} \int_0^p F_X^{-1}(q)\,dq, \quad p \in (0, 1). \tag{2.7}$$

$\mathrm{LTVaR}_p$ can be considered as a measure for the lower tail of the cdf $F_X$. Note that

$$\mathrm{LTVaR}_p[X] = -\mathrm{TVaR}_{1-p}[-X]. \tag{2.8}$$

Taking into account Equations 2.3, 2.6 and 2.8, we find the following expression, which relates $\mathrm{LTVaR}_p[X]$ to the lower tail transform $\mathbb{E}[(K - X)_+]$ for an appropriate choice of $K$:

$$\mathrm{LTVaR}_p[X] = F_X^{-1(\alpha)}(p) - \frac{1}{p}\,\mathbb{E}\!\left[\left(F_X^{-1(\alpha)}(p) - X\right)_+\right], \quad \alpha \in (0, 1) \text{ and } p \in (0, 1). \tag{2.9}$$

Finally, note that $\mathrm{TVaR}_p[X]$ and $\mathrm{LTVaR}_p[X]$ are connected by the following expression:

$$p\,\mathrm{LTVaR}_p[X] + (1-p)\,\mathrm{TVaR}_p[X] = \mathbb{E}[X], \quad p \in (0, 1). \tag{2.10}$$
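For an empirical distribution that puts equal weight on $n$ sample points, the two tail measures reduce to tail averages, and identity (2.10) holds exactly whenever $p$ is a multiple of $1/n$. A minimal sketch (the function names are ours):

```python
import numpy as np

def tvar(sample, p):
    """TVaR_p[X]: average of the empirical quantiles F^{-1}(q) over q in (p, 1).
    Exact for the empirical cdf when p = k/n."""
    xs = np.sort(np.asarray(sample, dtype=float))
    k = int(round(p * len(xs)))
    return xs[k:].mean()              # mean of the top (1-p) fraction

def ltvar(sample, p):
    """LTVaR_p[X]: average of the empirical quantiles over q in (0, p)."""
    xs = np.sort(np.asarray(sample, dtype=float))
    k = int(round(p * len(xs)))
    return xs[:k].mean()              # mean of the bottom p fraction

x = [1.0, 2.0, 3.0, 4.0, 10.0]
p = 0.6                               # = 3/5
print(tvar(x, p))                     # 7.0, the mean of {4, 10}
print(ltvar(x, p))                    # 2.0, the mean of {1, 2, 3}
# identity (2.10): p*LTVaR + (1-p)*TVaR recovers E[X] = 4.0
print(p * ltvar(x, p) + (1 - p) * tvar(x, p))
```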

2.2 Convex order and comonotonicity

The variability of (the cdf of) two r.v.'s can be compared via the notion of convex order; see e.g. Shaked and Shanthikumar (2007) [8] or Denuit et al. (2005) [9]. Recall that convex order between two r.v.'s $X$ and $Y$, notation $X \leq_{cx} Y$, is defined by

$$X \leq_{cx} Y \iff \begin{cases} \mathbb{E}[(X - K)_+] \leq \mathbb{E}[(Y - K)_+], & \text{for all } K \in \mathbb{R}, \\ \mathbb{E}[(K - X)_+] \leq \mathbb{E}[(K - Y)_+], & \text{for all } K \in \mathbb{R}. \end{cases} \tag{2.11}$$


Intuitively, relation 2.11 indicates that $Y$ has larger lower and upper tails than $X$. This means that $X \leq_{cx} Y$ can indeed be interpreted as "$Y$ is more variable than $X$".

Two r.v.'s $X$ and $Y$ can only be ordered in convex order in case $\mathbb{E}[X] = \mathbb{E}[Y]$. Moreover, it is straightforward to prove that

$$X \leq_{cx} Y \iff \begin{cases} \mathbb{E}[(X - K)_+] \leq \mathbb{E}[(Y - K)_+], & \text{for all } K \in \mathbb{R}, \\ \mathbb{E}[X] = \mathbb{E}[Y]. \end{cases} \tag{2.12}$$

Convex order can also be characterized in terms of TVaR's and LTVaR's:

$$X \leq_{cx} Y \iff \begin{cases} \mathrm{TVaR}_p[X] \leq \mathrm{TVaR}_p[Y], & \text{for all } p \in (0, 1), \\ \mathrm{LTVaR}_p[Y] \leq \mathrm{LTVaR}_p[X], & \text{for all } p \in (0, 1). \end{cases} \tag{2.13}$$

Consider a random vector $\mathbf{X} = (X_1, \ldots, X_n)$ with marginal distributions denoted by $F_{X_i}$, $i = 1, 2, \ldots, n$. The comonotonic modification of $\mathbf{X}$, notation $\mathbf{X}^c$, is defined by

$$\mathbf{X}^c \stackrel{d}{=} \left(F_{X_1}^{-1}(U), \ldots, F_{X_n}^{-1}(U)\right), \tag{2.14}$$

where $\stackrel{d}{=}$ stands for equality in distribution and $U$ is a r.v. which is uniformly distributed over the unit interval. The components of the random vector $\mathbf{X}^c$ are said to be comonotonic.

The components of the comonotonic random vector $\left(F_{X_1}^{-1}(U), \ldots, F_{X_n}^{-1}(U)\right)$ are maximally dependent in the sense that all of them are non-decreasing functions of the same random variable. Hence the comonotonic random variables are "common monotonic". From an economic point of view this means that holding a long position (or a short position) in comonotonic random variables can never lead to a hedge, as the variability of one can never lead to the counter variability of others [10].

Characterization (2.14) shows that a random vector is comonotonic if its components are jointly driven up or down by a single stochastic risk factor. Therefore, we say that $\mathbf{X}^c$ exhibits perfect herd behavior. For an introduction to the theory of comonotonicity, we refer to Dhaene et al. (2002a) [11]. Financial and actuarial applications are described in Dhaene et al. (2002b) [12]. An updated overview of applications of comonotonicity can be found in Deelstra et al. (2011) [13].

Let us introduce the notations

$$S := w_1 X_1 + w_2 X_2 + \cdots + w_n X_n \tag{2.15}$$

and

$$S^c := w_1 F_{X_1}^{-1}(U) + w_2 F_{X_2}^{-1}(U) + \cdots + w_n F_{X_n}^{-1}(U), \tag{2.16}$$

for the weighted sums of the components of $\mathbf{X}$ and $\mathbf{X}^c$, with deterministic positive weight factors $w_i$. It can be proven that

$$S \leq_{cx} S^c, \tag{2.17}$$

which means that in the class of all random vectors with given marginal distributions, a comonotonic situation leads to a weighted sum which is largest in convex order.
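The comonotonic modification and the ordering (2.17) can be illustrated by simulation. The bivariate example below is an illustrative choice of ours: coupling both marginals to one uniform draw implements (2.14), and the comonotonic sum shows the larger variance while the mean is unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)
m = 100_000

# two positively dependent, but not comonotonic, stock-price-like risks
x1 = rng.lognormal(0.0, 0.5, m)
x2 = 0.5 * x1 + rng.lognormal(0.0, 0.5, m)

# comonotonic modification (2.14): both marginals driven by the same U
u = rng.random(m)
xc1 = np.quantile(x1, u)              # F_{X1}^{-1}(U)
xc2 = np.quantile(x2, u)              # F_{X2}^{-1}(U)

s, sc = x1 + x2, xc1 + xc2            # S and S^c with w1 = w2 = 1
print(abs(s.mean() - sc.mean()) < 0.05)   # True: same Frechet class, equal means
print(s.var() < sc.var())                 # True: S <=_cx S^c implies Var(S) <= Var(S^c)
```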

2.3 Multivariate Dependence Measures

To evaluate aggregate risk in a financial or insurance portfolio, a risk analyst has to calculate the distribution function of a sum of random variables. In this paper we focus on the aggregate risk rather than on the copula or the joint distribution itself. Consider a portfolio $\mathbf{X}$ consisting of $n$ risk factors $X_1, X_2, \ldots, X_n$; then the aggregate risk is $S = X_1 + X_2 + \cdots + X_n$. To determine the distribution of this aggregate risk we have to know the joint distribution $F_{\mathbf{X}}$ of $(X_1, X_2, \ldots, X_n)$. In practice, however, this turns out to be very difficult. Modeling the marginal distribution of each $X_i$ is quite a common task, but finding the appropriate copula between the $X_i$ is much less straightforward. Moreover, the calculation of the aggregate distribution involves an $n$-dimensional integral, which is not very appealing for high-dimensional portfolios.

One way to tackle this problem is to simply neglect the dependence and assume that the risks are independent. Let $\mathbf{X}^\perp = (X_1^\perp, X_2^\perp, \ldots, X_n^\perp)$ be a random vector with the same marginal distributions as $\mathbf{X}$ but with independent components, i.e. $\mathbf{X}^\perp$ has cumulative distribution

$$F_{\mathbf{X}^\perp}(\mathbf{x}) = F_{X_1}(x_1) F_{X_2}(x_2) \cdots F_{X_n}(x_n).$$

The distribution of $S^\perp = X_1^\perp + X_2^\perp + \cdots + X_n^\perp$ can be obtained by the well-known convolution technique or, for some specific marginal distributions, by Panjer (1981) [14] and others. Obviously, by neglecting the (usually positive) dependence, we might underrate the aggregate risk, as $S^\perp$ will usually have a smaller variance. Note however that $\mathbb{E}[S] = \mathbb{E}[S^\perp]$, as $\mathbf{X}$ and $\mathbf{X}^\perp$ belong to the same Fréchet class¹.

¹ A Fréchet class collects all multivariate joint distribution functions that have the same marginals. The Fréchet bound corresponds to the case where the co-dependence structure is chosen so that the underlying assets are comonotonic. Members of a Fréchet class only differ with respect to the interdependence between their marginals. See Decancq (2011) [15].


Alternatively, one might consider the strongest positive dependence and assume that risks are comonotonic. Let $\mathbf{X}^c = (X_1^c, X_2^c, \ldots, X_n^c)$ be a random vector with the same marginal distributions as $\mathbf{X}$ but with comonotonic components, i.e. $\mathbf{X}^c$ has cumulative distribution

$$F_{\mathbf{X}^c}(\mathbf{x}) = \min\{F_{X_1}(x_1), F_{X_2}(x_2), \ldots, F_{X_n}(x_n)\}$$

or, equivalently,

$$\mathbf{X}^c \stackrel{d}{=} \left(F_{X_1}^{-1}(U), F_{X_2}^{-1}(U), \ldots, F_{X_n}^{-1}(U)\right), \quad U \sim U(0, 1),$$

where $\stackrel{d}{=}$ denotes equality in distribution.

The distribution function of $S^c = X_1^c + X_2^c + \cdots + X_n^c$ can be obtained by inverting its quantile function, which in turn equals the sum of the marginal quantile functions $F_{X_i}^{-1}$. Dhaene et al. (2002) [12] show that $S$ is smaller in convex order than $S^c$ (written $S \leq_{cx} S^c$), i.e.

$$\mathbb{E}[v(S)] \leq \mathbb{E}[v(S^c)]$$

for all real convex functions $v$, provided the expectations exist. This implies that $S^c$ has heavier tails than $S$ and $\mathrm{Var}[S^c] \geq \mathrm{Var}[S]$, so the aggregate risk will likely be overrated. Note that $\mathbf{X}$ and $\mathbf{X}^c$ also belong to the same Fréchet class, so $\mathbb{E}[S] = \mathbb{E}[S^c]$.

In order to choose between both approximations, or perhaps use a weighted average, we should have an indication of the accuracy. Clearly this accuracy will depend on the copula of $\mathbf{X}$, but it is also influenced by the marginal distributions. In Dhaene et al. (2013) [16], a multivariate dependence measure that takes both aspects into account is introduced. This measure differs from other multivariate dependence measures as it focuses on the aggregate risk $S$ rather than on the copula or the joint distribution function of $\mathbf{X}$. In a finance context, it can be translated into a measure for herd behavior; see Dhaene et al. (2012) [17].

2.3.1 Definitions and Properties

Most of the dependence measures proposed in the literature are written directly in terms of the copula or the joint distribution function of $\mathbf{X}$. Keeping the aggregate risk in mind, we use Dhaene et al. (2012) [17] to measure the dependence in $\mathbf{X}$ indirectly through the distribution of the sum $S$ of its components. More specifically, we focus on the variance of $S$. As convex order implies ordered variances, we have that $\mathrm{Var}(S) \leq \mathrm{Var}(S^c)$. This suggests the following multivariate dependence measure.

Definition 2.1. The dependence measure $\rho^c$ of a random vector $\mathbf{X}$ with non-degenerate margins is defined as

$$\rho^c(\mathbf{X}) = \frac{\mathrm{Var}(S) - \mathrm{Var}(S^\perp)}{\mathrm{Var}(S^c) - \mathrm{Var}(S^\perp)} = \frac{\sum_{i=1}^n \sum_{j<i} \mathrm{Cov}(X_i, X_j)}{\sum_{i=1}^n \sum_{j<i} \mathrm{Cov}(X_i^c, X_j^c)}, \tag{2.18}$$

provided that the covariances exist.

The first expression in Equation 2.18 is centered around the independent vector and normalized with respect to the comonotonic vector. From the second expression we see that $\rho^c$ can be interpreted as a normalized average of bivariate covariances. Since the numerator cannot exceed the denominator, $\rho^c$ is bounded from above by 1. Without imposing some restrictions on $\mathbf{X}$, however, there is no general lower bound.

The condition of non-degenerate margins ensures that the denominator in Equation 2.18 is non-zero. Dhaene et al. (2012) [17] prove this assertion and extend a result of Luan (2001) [18] for positive random variables to real-valued random variables.
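A sample version of Definition 2.1 is easy to sketch: estimate $\mathrm{Var}(S)$ directly, $\mathrm{Var}(S^\perp)$ as the sum of the marginal variances, and $\mathrm{Var}(S^c)$ by sorting each margin, which couples them comonotonically. The function name and the equicorrelated test data below are our own illustrative choices:

```python
import numpy as np

def rho_c(samples):
    """Estimate rho^c of Eq. (2.18) from an (m, n) array of m joint draws."""
    var_s = samples.sum(axis=1).var()              # Var(S)
    var_indep = samples.var(axis=0).sum()          # Var(S^perp): no cross terms
    comon = np.sort(samples, axis=0)               # sorted margins are comonotonic
    var_comon = comon.sum(axis=1).var()            # Var(S^c)
    return (var_s - var_indep) / (var_comon - var_indep)

rng = np.random.default_rng(1)
m, n = 50_000, 3
z = rng.standard_normal((m, 1))                    # one common factor
x = 0.8 * z + 0.6 * rng.standard_normal((m, n))    # pairwise correlation 0.64
print(round(rho_c(x), 2))                          # close to 0.64 for this design
```

Applying `rho_c` to an already comonotonic sample (e.g. the sorted margins themselves) returns exactly 1, the upper bound noted above.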

Lemma 2.2. Two random variables $X$ and $Y$ are both independent and comonotonic if and only if at least one of them is degenerate.

Proof. First, assume $Y$ is degenerate with value $a$, i.e. $\mathbb{P}(Y = a) = 1$ and $\mathbb{P}(Y \neq a) = 0$. Then,

$$F_{X,Y}(x, y) = \mathbb{P}(X \leq x, Y \leq y) = \begin{cases} 0, & y < a, \\ F_X(x), & y \geq a, \end{cases}$$

and

$$\min(F_X(x), F_Y(y)) = F_X(x) F_Y(y) = \begin{cases} 0, & y < a, \\ F_X(x), & y \geq a, \end{cases}$$

so $X$ and $Y$ are both independent and comonotonic.

Conversely, assume that $X$ and $Y$ are both independent and comonotonic. Without loss of generality, assume that $X$ is non-degenerate. Hence, there is at least one value $x$ for which $0 < F_X(x) < 1$. Since $X$ and $Y$ are independent and comonotonic, we have

$$F_{X,Y}(x, y) = \min(F_X(x), F_Y(y)) = F_X(x) F_Y(y), \quad \forall x, y.$$

For fixed $x$ with $0 < F_X(x) < 1$, $F_Y(y) < F_X(x)$ then implies $F_Y(y) = F_X(x) F_Y(y)$ and thus $F_Y(y) = 0$, because $F_X(x) \neq 1$. On the other hand, $F_Y(y) > F_X(x)$ implies $F_X(x) = F_X(x) F_Y(y)$ and thus $F_Y(y) = 1$, because $F_X(x) \neq 0$. The third case $F_Y(y) = F_X(x)$ would imply $F_X(x) = (F_X(x))^2$, which contradicts $0 < F_X(x) < 1$. Consequently, $F_Y(y)$ is either 0 or 1 and thus $Y$ is a degenerate random variable.

Definition 2.3. A random couple $(X, Y)$ is said to be positively quadrant dependent (PQD) if

$$\mathbb{P}(X \leq x, Y \leq y) \geq \mathbb{P}(X \leq x)\,\mathbb{P}(Y \leq y), \quad \text{for all } (x, y) \in \mathbb{R}^2.$$

This bivariate notion of dependence can be generalized to higher dimensions by defining positive orthant dependence; see Denuit et al. (2005) [9].

Definition 2.4. A random vector $(X_1, X_2, \ldots, X_n)$ is said to be positively lower orthant dependent (PLOD) if

$$\mathbb{P}(X_1 \leq x_1, \ldots, X_n \leq x_n) \geq \prod_{i=1}^n \mathbb{P}(X_i \leq x_i), \quad \text{for all } (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n. \tag{2.19}$$

It is said to be positively upper orthant dependent (PUOD) if

$$\mathbb{P}(X_1 > x_1, \ldots, X_n > x_n) \geq \prod_{i=1}^n \mathbb{P}(X_i > x_i), \quad \text{for all } (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n. \tag{2.20}$$

When both 2.19 and 2.20 hold, the random vector is called positively orthant dependent (POD).

From Dhaene et al. (2005) we have that for any two random variables $X$ and $Y$ with joint distribution $F_{X,Y}$ and marginal distributions $F_X$ and $F_Y$,

$$\mathrm{Cov}(X, Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \left(F_{X,Y}(x, y) - F_X(x) F_Y(y)\right) dx\,dy. \tag{2.21}$$

This implies that $\mathrm{Cov}(X_i^\perp, X_j^\perp) \leq \mathrm{Cov}(X_i^c, X_j^c)$ for all $i, j$. If

$$\sum_{i=1}^n \sum_{j=1}^n \mathrm{Cov}(X_i^\perp, X_j^\perp) = \sum_{i=1}^n \sum_{j=1}^n \mathrm{Cov}(X_i^c, X_j^c),$$

then $\mathrm{Cov}(X_i^c, X_j^c) = \mathrm{Cov}(X_i^\perp, X_j^\perp)$ for all $i$ and $j$, and thus $\mathrm{Cov}(X_i^c, X_j^c) = 0$ for all $i \neq j$. From Lemma 3 in Lehmann (1966) [19] we know that random variables that are PQD and uncorrelated are independent. Clearly, the couple $(X_i^c, X_j^c)$ is PQD, so $\mathrm{Cov}(X_i^c, X_j^c) = 0$ implies that $X_i^c$ and $X_j^c$ ($i \neq j$) are both comonotonic and independent. Lemma 2.2 then ensures that if $X_i$ is non-degenerate for a fixed $i$, all $X_j$ with $j \neq i$ are degenerate.
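Hoeffding's covariance identity (2.21) can be checked numerically. In the sketch below, the bivariate pair, grid and sample size are illustrative choices of ours; the double integral is approximated by a Riemann sum of the empirical cdfs:

```python
import numpy as np

rng = np.random.default_rng(2)
m = 20_000
x = rng.standard_normal(m)
y = x + rng.standard_normal(m)                    # Cov(X, Y) = 1 by construction

grid = np.linspace(-6.0, 6.0, 121)                # covers both supports
dx = grid[1] - grid[0]
A = (x[None, :] <= grid[:, None]).astype(float)   # indicators of {X <= a}
B = (y[None, :] <= grid[:, None]).astype(float)   # indicators of {Y <= b}
Fx, Fy = A.mean(axis=1), B.mean(axis=1)           # marginal empirical cdfs
Fxy = (A @ B.T) / m                               # joint empirical cdf on the grid

# Riemann sum of (F_{X,Y} - F_X F_Y) over the grid approximates Cov(X, Y)
cov_hoeffding = ((Fxy - np.outer(Fx, Fy)) * dx * dx).sum()
print(abs(cov_hoeffding - np.cov(x, y)[0, 1]) < 0.1)   # True: the estimates agree
```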

Dhaene's dependence measure satisfies the axioms of normalization, monotonicity, permutation invariance and duality of Taylor (2006) [20].


Chapter 3

The Stock Market

3.1 The stock market index

We consider a financial market¹ where $n$ (dividend or non-dividend paying) stocks, labeled from 1 to $n$, are traded. Suppose that the current time is $t = 0$. The price at time $t$, $0 \leq t \leq T \leq +\infty$, of stock $i$ is denoted by $X_i(t)$. Unless otherwise stated, we will always silently assume that $X_i(t) \geq 0$ for all $i$ and that its first and second order moments are finite. Apart from the stocks, there is a stock market index whose price is a linear combination of the prices of the $n$ underlying stocks. Denoting the price of the index at time $t$ by $S(t)$, $0 \leq t \leq T$, we have that

$$S(t) = w_1 X_1(t) + w_2 X_2(t) + \cdots + w_n X_n(t), \tag{3.1}$$

where $w_i$, $i = 1, 2, \ldots, n$, are positive weights that are fixed up front.

3.2 Behavioral Finance

"Even the smartest people are affected by psychological biases, but traditional finance has considered this irrelevant." - Nofsinger, 2005

Behavioral finance is a field of finance that proposes psychology-based theories to explain stock market anomalies such as severe rises or falls in stock prices. It holds that securities prices can deviate from their rational levels and be based on biased estimates of intrinsic value. Behavioral finance can help explain not only how investors behave and how markets function, but also how improvement can occur. Baker and Nofsinger (2010) [21] provide a comprehensive discussion of behavioral economics. Within behavioral finance, it is assumed that the information structure and the characteristics of market participants systematically influence individuals' investment decisions as well as market outcomes.

¹ We use the common approach to describe the financial market via a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{0 \leq t \leq T}, \mathbb{P})$.

3.2.1 Prospect Theory

In 1979, Kahneman and Tversky presented a critique of expected utility theory as a descriptive model of decision making under risk [22]. Their alternative model is now commonly known as prospect theory. Their paper details how choices among risky prospects exhibit several pervasive effects that are inconsistent with the basic tenets of utility theory. In particular, people underweight outcomes that are merely probable in comparison with outcomes that are obtained with certainty. This tendency, called the certainty effect, contributes to risk aversion in choices involving sure gains and to risk seeking in choices involving sure losses. In addition, people generally discard components that are shared by all prospects under consideration. This tendency, called the isolation effect, leads to inconsistent preferences when the same choice is presented in different forms. An alternative theory is required, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights. The value function is normally concave for gains, commonly convex for losses, and is generally steeper for losses than for gains. Decision weights are generally lower than the corresponding probabilities, except in the range of low probabilities. Overweighting of low probabilities may contribute to the attractiveness of both insurance and gambling.

Example: offer someone a choice of €50 with a probability of 1 or, on the flip of a coin, the possibility of winning €100 or winning nothing (probability of 50%). The chances are that the person will choose the sure thing, €50. Conversely, offer a choice of a sure loss of €50 or, on a flip of a coin, a loss of €100 or nothing. The person will probably take some more time to consider this, then choose to take the coin toss. The chance of the coin flipping either way is equivalent in both scenarios, yet people will go for the coin toss to save themselves from loss even though the coin flip could mean an even greater loss. People tend to view the possibility of recouping a loss as more important than the possibility of greater gain.

The simple Wall Street phrase "cut your losses short and let your winners run" is widely acknowledged yet not as often practiced by the average investor. In fact, the opposite is observed: investors choose to cash in on small successes while their value is rising, and hold on to small losses while the value is falling further. Even the most successful investors have incurred losses; losses are inevitable and unavoidable for gamblers and investors alike. Since losses cannot be avoided entirely, the objective should be to minimize them: investors should manage their risk so as to limit their losses.

3.2.2 Prospect Theory in Stock Markets

One of the most notorious financial events in recent memory was the bursting of the internet bubble. However, this was not the first time that events like this had happened in the markets. How could something so catastrophic be allowed to happen over and over again? The answer to this question can be found in what some people believe to be a hardwired human attribute: herd behavior, the tendency for individuals to mimic the actions (rational or irrational) of a larger group. Individually, however, most people would not necessarily make the same choice.

There are a couple of reasons why herd behavior happens. The first is the social pressure of conformity. You probably know from experience that this can be a powerful force. This is because most people are very sociable and have a natural desire to be accepted by a group, rather than be branded as an outcast. Therefore, following the group is an ideal way of becoming a member.

The second reason is the common rationale that it is unlikely that such a large group could be wrong. After all, even if you are convinced that a particular idea or course of action is irrational or incorrect, you might still follow the herd, believing they know something that you don't (information asymmetry). This is especially prevalent in situations in which an individual has very little experience.

Herd behavior was exhibited in the late 1990s as venture capitalists and private investors were frantically investing huge amounts of money into internet-related companies, even though most of these dot-com’s did not (at the time) have financially sound business models. The driving force that seemed to compel these investors to sink their money into such an uncertain venture was the reassurance they got from seeing so many others do the same thing.

A strong herd mentality also affects financial professionals. The ultimate goal of a money manager is to follow an investment strategy that maximizes a client's invested wealth. The problem lies in the amount of scrutiny that money managers receive from their clients whenever a new investment fad pops up. For example, a wealthy client may have heard about a new investment gimmick that's gaining notoriety and inquires about whether the money manager employs a similar "strategy".

In many cases, it’s tempting for a money manager to follow the herd of investment professionals. After all, if the aforementioned gimmick pans out, his clients will be


happy. If it doesn’t, that money manager can justify his poor decision by pointing out just how many others were led astray.

Herd behavior, as the dot-com bubble illustrates, is usually not a very profitable investment strategy. Investors that employ a herd-mentality investment strategy constantly buy and sell their investment assets in pursuit of the newest and hottest investment trends. For example, if a herd investor hears that internet stocks are the best investments right now, he will free up his investment capital and then dump it on internet stocks. If biotech stocks are all the rage six months later, he'll probably move his money again, perhaps before he has even experienced significant appreciation in his internet investments.

Keep in mind that all this frequent buying and selling incurs a substantial amount of transaction costs, which can erode available profits. Furthermore, it’s extremely difficult to time trades correctly to ensure that you are entering your position right when the trend is starting. By the time a herd investor knows about the newest trend, most other investors have already taken advantage of this news, and the strategy’s wealth-maximizing potential has probably already peaked. This means that many herd-following investors will probably be entering into the game too late and are likely to lose money as those at the front of the pack move on to other strategies.

While it’s tempting to follow the newest investment trends, an investor is generally better off steering clear of the herd. Just because everyone is jumping on a certain investment ”bandwagon” doesn’t necessarily mean the strategy is correct. Therefore, the soundest advice is to always do your research before following any trend. It is important to remember that particular investments favored by the herd can easily become overvalued because the investment’s high values are usually based on optimism and not on the underlying fundamentals.

3.2.3 Systemic Risk

Systemic risk refers to the possibility of a collapse of an entire financial system or market, differing from the risk associated with any particular individual or group pertaining to the system, which may include banks, government, brokers, and creditors. This type of risk is both unpredictable and impossible to avoid completely. It cannot be mitigated through diversification, only through hedging or by using the right asset allocation strategy. After the 2008 financial crisis, a significant amount of effort has been directed to the study of systemic risk and its consequences around the world.


Chapter 4

Portfolio Theory

4.1

Modern Portfolio Theory

"Don't put all your eggs in one basket" is a long-standing expression amongst the portfolio manager community; however, it is unknown which came first, diversification techniques or the famed expression, a classic "chicken or the egg" conundrum. Portfolio management is an ongoing process of constructing portfolios that balance the objectives of the portfolio manager and the investor (minimize risk for a certain level of return). By constructing a diversified portfolio, a portfolio manager can reduce the risk for a given level of expected return, compared to investing in a single asset or security, thus reducing the exposure of the portfolio. The basic argument follows from portfolio theory: joining together two less than perfectly correlated income streams reduces the relative variability of the streams.

Diversification is the concept that one can reduce total risk without sacrificing possible returns by investing in more than one asset. A financial institution, be it an insurance company or a bank, faces a multitude of risks that lead to potential loss. However, not all the risks the financial institution is facing will suffer adverse losses at the same time. Some areas of business may experience adverse financial losses whilst others experience average losses, or even profits. For example, suppose a farmer decides to invest entirely in a poultry farm. The farm sells free-range, organic chicken to the local butchers and supermarkets. Unexpectedly, an outbreak of bird flu sweeps the farm and wipes out the entire flock. Profits dry up and the farmer experiences financial ruin. Now consider the investment decision again: this time the farmer chooses to invest half of his wealth in the poultry farm and the other half in a herd of cattle, which offers an equal return on his investment. The inevitable bird flu may have wiped out the demand for poultry, but the herd of cattle is immune to this and remains healthy. The investment in the latter scenario


is exposed to less risk by diversifying investment between the poultry and the cattle. By investing in more than one asset, the investment is less exposed to "asset specific" risks. The magnitude of the effect of bird flu on the return is reduced. Diversification can reduce volatility without necessarily reducing expected return. It is important to remember that there also exist risks that cannot be mitigated by diversification. For example, if the EU ceases to offer agricultural subsidies to the farmer, his costs will increase and profits decrease. Similarly, a rise in interest rates would affect all businesses, as they all save or spend money. It is possible to conceptually divide all risk into two categories: diversifiable and non-diversifiable risk (also known as "systematic risk" or "market risk", as mentioned in Section 3.2.3).

Considering this, let us assume the goal of a business is to maximize wealth. Hence, decisions in portfolio theory should not be evaluated only on the expected return from an investment but, more directly, on how they affect the amount and uncertainty of the future cash-flow stream accruing to the owners. [23]

4.2

Optimal Portfolio Design:

Basic Markowitz Mean-Variance Framework

The basic mean-variance framework assumes a single investment period, a frictionless market, risky assets with normally distributed returns, and possibly a risk-free asset whose expected return is known ex-ante1. In this framework, which follows from Wilmott (2007) [24], one can derive efficient sets agreed upon by all investors, sharing the same information, and investor-specific optimal portfolios depending on each investor’s risk preference.

Assume that asset returns are jointly normally distributed. The normal distribution has the property that the distribution can be completely described by two parameters: the mean and variance. The multivariate normal can be completely described by the mean vector, variance vector and the correlation matrix.

For a portfolio of N assets, the value today of the ith asset is X_i and its random return is R_i over our time horizon T. Returns are independent and normally distributed with mean µ_i T and standard deviation σ_i √T. The correlation between the returns of the ith and jth assets is ρ_ij (with ρ_ii = 1). The parameters µ, σ and ρ (2) correspond to the expected return, volatility and correlation. Note the scaling with the time horizon.

(1) In the financial world, the ex-ante return is the expected return of an investment portfolio.
(2) µ is an N-length vector containing the marginal expected returns of the individual assets. Similarly, σ is an N-length vector containing the volatility of each individual asset. ρ is an N × N matrix containing the correlations between the assets.



For weight w_i of the ith asset, our portfolio S has value

\[ S = \sum_{i=1}^{N} w_i X_i. \]

At the end of our time horizon the value is

\[ S + \delta S = \sum_{i=1}^{N} w_i X_i (1 + R_i), \]

for time period δ.

Write the relative change in portfolio value as

\[ \frac{\delta S}{S} = \sum_{i=1}^{N} W_i R_i \tag{4.1} \]

where

\[ W_i = \frac{w_i X_i}{\sum_{i=1}^{N} w_i X_i}. \]

Note: the weights W_i sum to 1 for i = 1, . . . , N.

From (4.1) it is simple to calculate the expected return on the portfolio

\[ \mu_S = \frac{1}{T} E\!\left[\frac{\delta S}{S}\right] = \sum_{i=1}^{N} W_i \mu_i \tag{4.2} \]

and the standard deviation of the return

\[ \sigma_S = \frac{1}{\sqrt{T}} \sqrt{\mathrm{var}\!\left(\frac{\delta S}{S}\right)} = \sqrt{\sum_{i=1}^{N} \sum_{j=1}^{N} W_i W_j \rho_{ij} \sigma_i \sigma_j}. \tag{4.3} \]

Here, we have related the parameters for the individual assets to the expected return and the standard deviation of the entire portfolio through the correlation parameters.
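Equations (4.2) and (4.3) translate directly into code. A minimal sketch in Python; the two-asset parameter values at the bottom are illustrative assumptions, not taken from the text:

```python
import numpy as np

def portfolio_mean_std(W, mu, sigma, rho):
    """Expected return (4.2) and standard deviation (4.3) of a portfolio.

    W     : weights summing to one
    mu    : vector of annualized expected returns
    sigma : vector of annualized volatilities
    rho   : correlation matrix (rho[i, i] = 1)
    """
    W, mu, sigma = map(np.asarray, (W, mu, sigma))
    mu_S = W @ mu                          # Equation (4.2)
    cov = rho * np.outer(sigma, sigma)     # Sigma_ij = rho_ij * sigma_i * sigma_j
    sigma_S = np.sqrt(W @ cov @ W)         # Equation (4.3)
    return mu_S, sigma_S

# Illustrative two-asset example (assumed parameters)
rho = np.array([[1.0, 0.3],
                [0.3, 1.0]])
mu_S, sigma_S = portfolio_mean_std([0.5, 0.5], [0.10, 0.06], [0.20, 0.10], rho)
```

The correlation parameters enter only through the covariance matrix, mirroring how (4.3) relates the individual-asset parameters to the portfolio standard deviation.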

4.2.1 Independent Assets

Suppose assets in a portfolio are uncorrelated, ρ_ij = 0 for i ≠ j. To keep things simple, assume that they are equally weighted, so that W_i = 1/N. The expected return on the portfolio is


\[ \mu_S = \frac{1}{N}\sum_{i=1}^{N} \mu_i, \]

the average of the expected returns on all the assets, and the volatility becomes

\[ \sigma_S = \sqrt{\frac{1}{N^2}\sum_{i=1}^{N} \sigma_i^2}. \]

This volatility is O(N^{-1/2}) since there are N terms in the sum. As we increase the number of assets in the portfolio, the standard deviation of the returns tends to zero. From here on, volatility or standard deviation will be referred to as risk, something to be reduced (within reason), and the expected return as reward, something to be maximised.

4.2.2 Dependent Assets

The effect of correlation on portfolio variance, following Equation (4.3), is demonstrated numerically below.

Consider two assets i and j with equal variances, i.e. σ_i = σ_j = σ, and equal portfolio weights, i.e. w_i = w_j = 0.5.

Case 1: Perfect positive correlation ρ = 1

\[ \sigma_S^2 = 0.25\sigma^2 + 0.5\sigma^2 + 0.25\sigma^2 = \sigma^2 \]

This demonstrates that the portfolio variance is the same as the variance of each asset. Diversification does not reduce the portfolio variance.

Case 2: No correlation ρ = 0

\[ \sigma_S^2 = 0.25\sigma^2 + 0.25\sigma^2 = 0.5\sigma^2 \]

This result demonstrates that the portfolio variance is half the variance of the individual assets. So combining stocks that have less than perfect positive correlation is a strategy that will reduce the variance of the returns on your portfolio. This is called diversification.



Case 3: Perfect negative correlation ρ = −1

\[ \sigma_S^2 = 0.25\sigma^2 - 0.5\sigma^2 + 0.25\sigma^2 = 0 \]

These assets create a perfect hedge. This shows that diversification can be thought of as a partial hedge of risks.

Increasing the number of less than perfectly correlated stocks in a portfolio reduces the standard deviation of the portfolio. [25]
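The three correlation cases can be verified numerically from Equation (4.3); a small sketch with an assumed common volatility:

```python
import numpy as np

sigma = 0.2                  # common volatility of both assets (assumed)
w = np.array([0.5, 0.5])     # equal weights, as in the three cases above

def portfolio_variance(rho):
    """Two-asset portfolio variance for correlation rho and identical sigmas."""
    cov = np.array([[sigma**2, rho * sigma**2],
                    [rho * sigma**2, sigma**2]])
    return w @ cov @ w

var_perfect = portfolio_variance(1.0)    # Case 1: equals sigma**2
var_zero    = portfolio_variance(0.0)    # Case 2: half of sigma**2
var_hedge   = portfolio_variance(-1.0)   # Case 3: zero, a perfect hedge
```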

Using this framework it is possible to discuss the 'best' portfolio.

4.3

The Efficient Frontier

4.3.1 A Simple Example

The definition of ‘best’ was addressed very successfully by Nobel Laureate Harry Markowitz. His model provides a way of defining portfolios that are efficient. An efficient portfolio is one that has the highest reward for a given level of risk, or the lowest risk for a given reward. To see how this works imagine that there are five assets in the world, A, B, C, D and E, illustrated in Figure (4.1). The question is, if one is limited to buying a single asset, which asset is the best choice?

Figure 4.1: Risk and Return in a hypothetical five asset world

The choice between A, B, C and D is not clear. Consider A and E: both have an equal return, but A is less risky. A risk-averse investor will always choose A instead of E, so A


dominates E. Similarly, D dominates E, as D has a greater return for an equal level of risk. B and C both dominate D, with greater returns and less risk. So in a risk-averse five asset universe, a rational investor will never choose to buy asset E. It is not possible to say objectively which of A, B, C and D is the better; this is a subjective choice and depends on an investor's risk preferences.

Consider assets A and D of Figure (4.1). If a combination of these assets is allowed in a portfolio, what effect does this have on risk and reward?

From (4.2) and (4.3) we have

\[ \mu_S = W\mu_A + (1 - W)\mu_D \tag{4.4} \]

and

\[ \sigma_S^2 = W^2\sigma_A^2 + 2W(1 - W)\rho_{A,D}\,\sigma_A\sigma_D + (1 - W)^2\sigma_D^2. \tag{4.5} \]

Figure 4.2: Efficient frontier of assets A and D. The upper part of the hyperbola from point A to D is known as the efficient frontier. Any combination of assets on this line is preferable to the rest of the curve. Again, an individual's risk preferences will determine where to be on the curve.

Suppose there is an opportunity to invest in a two-asset portfolio for one year. There are no risk-free assets, and wealth must be entirely invested in a combination of the two assets. W is the percentage of wealth assigned to asset A and, remembering that the weights must sum to one, the percentage of wealth assigned to asset D is 1 − W. µ_S, µ_A and µ_D are the expected annualized returns of the portfolio S, asset A and asset D respectively. σ_S², σ_A² and σ_D² are the annualized variances of the portfolio S, asset A and asset D respectively. Finally, ρ_{A,D} is the correlation between the returns of assets A and D.


W_A    W_D    µ_P     σ_P
0.0    1.0    0.150   0.2500
0.1    0.9    0.139   0.2262
0.2    0.8    0.128   0.2026
0.3    0.7    0.117   0.1794
0.4    0.6    0.106   0.1567
0.5    0.5    0.095   0.1348
0.6    0.4    0.084   0.1141
0.7    0.3    0.073   0.0955
0.8    0.2    0.062   0.0805
0.9    0.1    0.051   0.0712
1.0    0.0    0.040   0.0700

Table 4.1: Two asset portfolio with varying weights. These data points are represented by the line joining A to D in Figure 4.2.
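The rows of Table 4.1 can be reproduced from (4.4) and (4.5). The asset parameters below (µ_A = 0.04, σ_A = 0.07, µ_D = 0.15, σ_D = 0.25 and ρ_{A,D} = 0.15) are inferred from the table's endpoint rows rather than stated explicitly in the text, so treat them as an assumption:

```python
import numpy as np

mu_A, sigma_A = 0.04, 0.07   # read off the W_A = 1.0 row of Table 4.1
mu_D, sigma_D = 0.15, 0.25   # read off the W_A = 0.0 row of Table 4.1
rho = 0.15                   # implied by the intermediate rows (assumption)

def two_asset_point(W):
    """Return (mu_S, sigma_S) for weight W in asset A, per (4.4) and (4.5)."""
    mu_S = W * mu_A + (1 - W) * mu_D
    var_S = (W**2 * sigma_A**2
             + 2 * W * (1 - W) * rho * sigma_A * sigma_D
             + (1 - W)**2 * sigma_D**2)
    return mu_S, np.sqrt(var_S)

# Sweep the weight from 0 to 1 in steps of 0.1, as in the table
frontier = [two_asset_point(W) for W in np.linspace(0.0, 1.0, 11)]
```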

As the weights of possible portfolios vary, the risk and return also change. The line in risk/reward space that is parameterized by W is a hyperbola, as shown in Figure (4.2). When one of the volatilities is zero, the line becomes straight. Anywhere on the curve between the two points requires a long position in each asset. Outside this region, one of the assets is sold short to finance the purchase of the other. Everything that follows assumes that we can sell short as much of an asset as we like, without limit. The results change slightly when there are restrictions.

Figure 4.3: Varying weight combinations of assets A and D are mapped on the efficient frontier. This determines the shape of the efficient frontier. The curve displays a mirroring portfolio below in blue with less return for the same level of standard deviation. It is clear that this is dominated by the pink part of the curve; we disregard the blue part. In reality, there is an infinite number of combinations of the two assets.


4.3.2 Risk-free rate and leverage

For a portfolio with more than two assets, there is no longer a simple hyperbola of possible risk/reward profiles; this is replaced by the green frontier in Figure (4.4). This figure now uses all of A, B, C and D, not just A and D. Even though B and C are not individually appealing, they may be useful in a portfolio, depending on how they correlate, or not, with other investments. In this figure the efficient frontier without short-selling is marked in green. Given any choice of portfolio, the optimal choice is to hold a position that lies on this efficient frontier. A new efficient frontier, which allows for short selling, is marked in blue. The curve stretches to the left, indicating less risk for the same return. In red, there is another efficient frontier which incorporates the risk-free rate and short selling. The point where the red line is tangent to the blue (short-selling) frontier is known as the market portfolio.

A risk-free investment earning a guaranteed rate of return R_f is the point R in Figure (4.4). If we are allowed to hold this asset in our portfolio, since the volatility of this asset is zero, we get the new efficient frontier (red line), known as the capital market line (CML). The point of tangency corresponds to a portfolio on the efficient frontier, known as the market portfolio.

Figure 4.4: Effect of short sales restrictions on the efficient frontier. The figure shows two efficient frontiers for a set of four stocks that are less than perfectly correlated. The capital market line, in red, is a new efficient frontier that includes a risk-free return of R.




A risk-free asset is defined as an investment that has no capital loss over a predetermined period, and the risk-free return R_f is the return that is earned on such an investment.

The excess return is the difference between the return on a portfolio R_S and the risk-free return R_f:

\[ R_S - R_f. \]

If we consider a relative return analysis, the return is measured against a reference portfolio. Typically, risky assets are equities, and Modern Portfolio Theory (MPT) originates from research into equities [26].

James Tobin (1958) [27] introduced the idea of leverage to portfolio theory by incorpo-rating into the analysis an asset which pays a risk-free rate. By combining a risk-free asset with a portfolio on the efficient frontier, it is possible to construct portfolios whose risk–return profiles are superior to those of portfolios on the efficient frontier.

Using the risk-free asset, investors who hold the market portfolio may: leverage their position by shorting the risk-free asset and investing the proceeds in additional holdings in the market portfolio, or deleverage their position by selling some of their holdings in the super-efficient portfolio and investing the proceeds in the risk-free asset.

The resulting portfolios have risk-reward profiles which all fall on the capital market line. Accordingly, portfolios which combine the risk-free asset with the market portfolio are superior from a risk-reward standpoint to the portfolios on the efficient frontier. [28] Tobin concluded that portfolio construction should be a two-step process. First, investors should determine the market portfolio. This should comprise the risky portion of their portfolio. Next, they should leverage or deleverage the super-efficient portfolio to achieve whatever level of risk they desire. Significantly, the composition of the super-efficient portfolio is independent of the investor's appetite for risk. The two decisions:

• the composition of the risky portion of the investor’s portfolio, and

• the amount of leverage to use,

are entirely independent of one another. One decision has no effect on the other. This is called Tobin’s separation theorem.

The blue hyperbola that we see in Figure 4.5 is the efficient frontier that allows for short sales (4). As we vary the weights of our investment in each asset, the risk and reward also

(4) A short position is the sale of a borrowed security, commodity or currency with the expectation that the asset will fall in value. A long position is the opposite of a short position: the buying of a security with the expectation that the asset will rise in value.


Figure 4.5: Effect of short sales restrictions on the efficient frontier. The figure shows three efficient frontiers. The returns and standard deviations are estimated on an annual basis. The green (lower) frontier is constrained by a "no short sales" restriction. The blue (higher) frontier allows short sales. The red capital market line is the efficient frontier which includes borrowing at the risk-free rate R.

change. Given any choice of combination of asset weights, including the short-selling of the risk-free asset, we would choose to hold one that lies on this (blue) efficient frontier.

4.4

Mean-Variance Optimization for the long-only portfolio problem

Consider a portfolio with positions of w_i ≥ 0 dollars in asset i, for a set of n assets with centered returns R_i at the investment horizon, total returns R̃_i, and portfolio return R̃_Π. We suppose a total investment of unity, and hence for the long-only problem we have the constraint

\[ \sum_{i=1}^{n} w_i = 1, \qquad w_i \ge 0. \]

An optimization problem is one where the goal is to find the "best possible value" that a function f can take, subject to a number of constraints. This involves finding the minimum or maximum of f, or of a function built around f.

Optimization techniques are very often used in finance to solve a wide array of problems, ranging from calculating a bond yield, to valuing derivatives, or, in our case, solving portfolio selection problems.


\[
\begin{aligned}
\min_{x_1,\ldots,x_n}\quad & f(x_1,\ldots,x_n) \qquad\qquad (4.6)\\
\text{subject to:}\quad & g_k(x_1,\ldots,x_n) \;\{\le,\,=,\,\ge\}\; b_k, \qquad k = 1,\ldots,m
\end{aligned}
\]

The function f is the objective function: the function we want to optimize (here minimize). The variables x_1, . . . , x_n are the decision variables with respect to which we want to optimize the function. The functions g_1, . . . , g_m are the m constraints faced in our optimization. These constraints can be equalities or inequalities.

From standard calculus, we know that:

• the gradient (vector of derivatives) of f at x∗, denoted by f′(x∗), must be zero. This is a necessary condition, but it is not sufficient, as minima, maxima and inflection points all have a derivative equal to zero.

• the Hessian (i.e. matrix of second derivatives) of f at x∗, denoted by f″(x∗), must be positive definite (negative definite for a maximization). This is a sufficient condition.

These two conditions are fundamental in optimization. The first condition is used to find a set of potential solutions and the second is used to check which of these answers satisfy the problem. They are referred to as first order (necessary) condition and second order (sufficient) condition.

This leads to mean-variance optimization, continued from Section 4.2. In an economy with n assets, each asset i is entirely characterized by its expected return µ_i and expected standard deviation σ_i. In addition, assets i and j are correlated with correlation ρ_ij. The proportion of the portfolio invested in asset i is w_i.

The vector of asset expected returns µ is defined as:

\[ \mu = (\mu_1, \ldots, \mu_n)'. \]

The covariance matrix Σ is given by:

\[
\Sigma =
\begin{pmatrix}
\sigma_1^2 & \rho_{12}\sigma_1\sigma_2 & \cdots & \rho_{1n}\sigma_1\sigma_n \\
\rho_{21}\sigma_2\sigma_1 & \sigma_2^2 & \cdots & \rho_{2n}\sigma_2\sigma_n \\
\vdots & \vdots & \ddots & \vdots \\
\rho_{n1}\sigma_n\sigma_1 & \cdots & \cdots & \sigma_n^2
\end{pmatrix}
\]

The vector of asset weights is

\[ W = (w_1, \ldots, w_i, \ldots, w_n)'. \]

4.4.1 Risk Minimization for N Risky Assets

The portfolio selection problem is generally defined as a minimization of risk subject to a return constraint. Two reasons for this convention are:

• a return objective seems intuitively easier to formulate than a risk objective;

• risks are easier to control than returns;

If we adopt this convention, our objective function is the portfolio variance, and we will minimize it with respect to the portfolio weights. Instead of using the portfolio variance directly, we will use a little trick and scale it by a factor of 1/2 to ease our calculations. Since the factor is positive, it does not affect the optimal vector of weights w∗.

\[ \min_{w}\ \frac{1}{2}\sigma_\pi^2 = \min_{w}\ \frac{1}{2} w'\Sigma w \tag{4.7} \]

Now for the constraints.

• The portfolio return must be equal to a prespecified level m:

\[ \mu_\pi = \mu' w = w'\mu = m \]

• Budget equation: the sum of all weights must equal 1. Since there are no risk-free assets, our wealth must be entirely invested in a combination of the n assets:

\[ w'\mathbf{1} = 1 \]



where 1 is an n-element unit vector:

\[ \mathbf{1} = (1, \ldots, 1)'. \]

Summing it all up, we formulate the portfolio selection problem as

\[ \min_{w}\ \frac{1}{2} w'\Sigma w \quad \text{subject to:} \quad w'\mu = m, \qquad w'\mathbf{1} = 1. \]

This problem is an optimization with equality constraints. It is solved using the method of Lagrange.

Form the Lagrange function with two Lagrange multipliers λ and γ:

\[ L(w, \lambda, \gamma) = \frac{1}{2} w'\Sigma w + \lambda(m - w'\mu) + \gamma(1 - w'\mathbf{1}) \]

Next, we solve for the first order condition by taking the derivative with respect to the vector w:

\[ \frac{\partial L}{\partial w}(w, \lambda, \gamma) = \Sigma w - \lambda\mu - \gamma\mathbf{1} = 0 \tag{4.8} \]

Checking the second order condition, the Hessian of the objective function is equal to the covariance matrix Σ, which is positive definite. Therefore, we have reached the optimal weight vector w∗:

\[ w^* = \Sigma^{-1}(\lambda\mu + \gamma\mathbf{1}) \tag{4.9} \]

To get to this relationship, we have premultiplied (4.8) by the inverse matrix Σ⁻¹. Now we need to find values for λ and γ and then substitute them into (4.9). Recall the constraints:

\[ w'\mu = \mu'w = m, \qquad w'\mathbf{1} = 1. \]


Substituting w∗ into these two equations, we get:

\[ \mu'\Sigma^{-1}(\lambda\mu + \gamma\mathbf{1}) = \lambda\,\mu'\Sigma^{-1}\mu + \gamma\,\mu'\Sigma^{-1}\mathbf{1} = m \]
\[ \mathbf{1}'\Sigma^{-1}(\lambda\mu + \gamma\mathbf{1}) = \lambda\,\mathbf{1}'\Sigma^{-1}\mu + \gamma\,\mathbf{1}'\Sigma^{-1}\mathbf{1} = 1 \]

For convenience, we define the following scalars:

\[ A = \mathbf{1}'\Sigma^{-1}\mathbf{1}, \qquad B = \mu'\Sigma^{-1}\mathbf{1} = \mathbf{1}'\Sigma^{-1}\mu, \qquad C = \mu'\Sigma^{-1}\mu. \]

Note that AC − B² > 0.

The previous system of equations for the Lagrange multipliers becomes

\[ \lambda = \frac{Am - B}{AC - B^2}, \qquad \gamma = \frac{C - Bm}{AC - B^2}. \tag{4.10} \]

All we need to do now is substitute these values back into (4.9) to obtain w∗.
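The closed-form solution (4.9)-(4.10) translates directly into code. A minimal sketch in NumPy; the three-asset covariance matrix and target return m below are illustrative assumptions, not the thesis data:

```python
import numpy as np

def min_variance_weights(mu, Sigma, m):
    """Optimal weights w* = Sigma^{-1}(lambda*mu + gamma*1), Eqs. (4.9)-(4.10)."""
    mu = np.asarray(mu, dtype=float)
    ones = np.ones_like(mu)
    Sigma_inv = np.linalg.inv(Sigma)
    A = ones @ Sigma_inv @ ones
    B = mu @ Sigma_inv @ ones
    C = mu @ Sigma_inv @ mu
    lam = (A * m - B) / (A * C - B**2)     # lambda from (4.10)
    gam = (C - B * m) / (A * C - B**2)     # gamma from (4.10)
    return Sigma_inv @ (lam * mu + gam * ones)

# Illustrative three-asset example (assumed inputs)
mu = [0.08, 0.10, 0.12]
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
w_star = min_variance_weights(mu, Sigma, m=0.10)
```

The resulting weights satisfy both constraints by construction: they sum to one and deliver exactly the target return m. Note that, unlike the long-only problem, this closed form can produce negative (short) weights.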

4.5

Drawback of Modern Portfolio Theory:

Dimensionality and parameter estimation

The inputs to the Markowitz model are expected returns, volatilities and correlations. With N assets this means N + N + N (N − 1)/2 parameters. In practice, most of these cannot be known accurately (do they even exist?); only the volatilities are at all reliable. Having input these parameters, we must optimize over all weights of assets in the portfolio: Choose a portfolio risk and find the weights that make the return on the portfolio a maximum subject to this volatility. This is a very time-consuming process computationally unless one only has a small number of assets.


Chapter 5

Value at Risk

”Value-at-Risk” (VaR) measures the worst expected loss over a given horizon under normal market conditions at a given confidence level.

For example, suppose the VaR of a portfolio is $5 million at a 95% confidence level with a target horizon of one year. This means that there is a 5 in 100 chance that the portfolio will lose over $5 million within one year under normal market conditions. We can write this as

\[ \text{Prob}[\delta V \le -\$5\text{m}] = 0.05 \]

where δV is the change in the portfolio's value.

Alternatively, we can write the above equation as:

\[ \text{Prob}[\delta V \le -\text{VaR}] = 1 - c \]

where the degree of confidence is c.

It is important to clearly state "normal market conditions": this implies that extreme market conditions, such as crashes, are either not considered or are examined separately.

One method of estimating the VaR is to assume that the distribution of the portfolio returns follows a normal distribution. Hence, we can multiply the standard deviation of the returns by the 95th percentile of the standard normal distribution and subtract this from the mean of the returns.


5.1

Measuring VaR for a portfolio

If we know the volatilities of all the assets in our portfolio and the correlations between them then we can calculate the VaR for the whole portfolio.

If the volatility of the ith asset is σ_i and the correlation between the ith and jth assets is ρ_ij (with ρ_ii = 1), then the VaR for a portfolio consisting of N assets with a holding of w_i of the ith asset (whose price is S_i) is

\[ \text{VaR}_c = -\alpha(1 - c)\,\delta t^{1/2} \sqrt{\sum_{j=1}^{N}\sum_{i=1}^{N} w_i w_j \rho_{ij}\,\sigma_i \sigma_j\, S_i S_j} \tag{5.1} \]

where α(.) is the inverse cumulative distribution function for the standardized Normal distribution.

VaR is also favourable as it looks at downside risk. Hence, unlike the variance of a portfolio, it is not impacted by high returns.
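The parametric VaR calculation of Equation (5.1) can be sketched as follows. Here `w` holds the dollar value invested in each asset, playing the role of w_i S_i in (5.1); the confidence level, horizon and asset parameters are illustrative assumptions:

```python
import numpy as np
from scipy.stats import norm

def portfolio_var(w, sigma, rho, c=0.95, dt=1 / 252):
    """Parametric VaR per Equation (5.1), in the same units as w.

    w     : dollar holdings in each asset (w_i * S_i in the text's notation)
    sigma : annualized volatilities
    rho   : correlation matrix (rho[i, i] = 1)
    c     : confidence level
    dt    : horizon in years
    """
    w, sigma = np.asarray(w), np.asarray(sigma)
    cov = rho * np.outer(sigma, sigma)
    # alpha(1 - c) is the (1 - c) quantile of the standard normal (negative),
    # so the leading minus sign makes the VaR a positive loss figure.
    return -norm.ppf(1 - c) * np.sqrt(dt) * np.sqrt(w @ cov @ w)

var_95 = portfolio_var([0.6, 0.4], [0.25, 0.15],
                       np.array([[1.0, 0.3], [0.3, 1.0]]))
```

Raising the confidence level c pushes the quantile further into the tail and therefore increases the VaR.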

5.2

Minimizing VaR for N Risky Assets

Rather than just calculating the VaR of a portfolio, we wish to use the VaR formulation as the objective function and minimize it with respect to the portfolio weights, just as we minimized the variance in Section 4.4.1:

\[
\begin{aligned}
\min_{w_1,\ldots,w_n}\quad & \text{VaR}(w_1,\ldots,w_n) \qquad\qquad (5.2)\\
\text{subject to:}\quad & g_k(w_1,\ldots,w_n) \;\{\le,\,=,\,\ge\}\; b_k, \qquad k = 1,\ldots,m
\end{aligned}
\]

For the long-only portfolio selection problem the constraints are as follows.

The vector of asset expected returns µ is defined as:

\[ \mu = (\mu_1, \ldots, \mu_n)'. \]


The covariance matrix Σ is given by:

\[
\Sigma =
\begin{pmatrix}
\sigma_1^2 & \rho_{12}\sigma_1\sigma_2 & \cdots & \rho_{1n}\sigma_1\sigma_n \\
\rho_{21}\sigma_2\sigma_1 & \sigma_2^2 & \cdots & \rho_{2n}\sigma_2\sigma_n \\
\vdots & \vdots & \ddots & \vdots \\
\rho_{n1}\sigma_n\sigma_1 & \cdots & \cdots & \sigma_n^2
\end{pmatrix}
\]

The vector of asset weights is

\[ W = (w_1, \ldots, w_n)'. \]

Chapter 6

Herd Behavior Index

6.1

Creating a comonotonic sample

In this section we use some comments from Dhaene et al. (2013) [29] on the estimation of ρc from a sample of X. This will be used to estimate the dependence in a dataset; it also provides a computationally convenient way to calculate ρc when the (co)variances are hard to find. In that case one could try to generate a sample from X and estimate ρc from that sample.

A straightforward way to estimate ρc is to replace the variances in Equation (2.18) by their sample versions. Consider a d-dimensional sample {(x_{i1}, . . . , x_{id})}_{i=1,...,n} of size n and denote

\[ \bar{x}_j = \frac{1}{n}\sum_{i=1}^{n} x_{ij}, \qquad j = 1, \ldots, d, \]
\[ s_i = \sum_{j=1}^{d} x_{ij}, \qquad i = 1, \ldots, n, \]
\[ \bar{s} = \frac{1}{n}\sum_{i=1}^{n} s_i = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{d} x_{ij} = \sum_{j=1}^{d} \bar{x}_j. \]

Var(S) and Var(S⊥) can then be estimated by

\[ \frac{1}{n-1}\sum_{i=1}^{n} (s_i - \bar{s})^2 \]

and

\[ \sum_{j=1}^{d} \frac{1}{n-1}\sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2 \]

respectively, since \( \mathrm{Var}(S^\perp) = \sum_{j=1}^{d} \mathrm{Var}(X_j) \).

For the estimation of Var(S^c) we need a sample of S^c or, alternatively, of X^c. Dhaene et al. (2002b) [11] show that for any x and y in the range of a comonotonic vector, either x ≤ y or y ≤ x holds. In other words, all possible outcomes of X^c are ordered componentwise. As X and X^c also have the same marginal distributions, we can easily turn the sample of X into a sample of X^c. Denoting the i-th order statistic of X_j by x_{(i)j}, we find the following sample of X^c: {(x_{(i)1}, . . . , x_{(i)d})}_{i=1,...,n}. Accordingly, {x_{(i)1} + · · · + x_{(i)d}}_{i=1,...,n} constitutes a sample of S^c.

This sample also follows from the additivity of the quantile function for comonotonic variables:

\[ F^{-1}_{S^c}(p) = \sum_{j=1}^{d} F^{-1}_{X_j}(p), \qquad p \in (0, 1). \]

Replacing the quantile function by its empirical counterpart and setting p = (i − 0.5)/n, we find

\[ s_{(i)} = \hat{F}^{-1}_{S^c}\!\left(\frac{i - 0.5}{n}\right) = \sum_{j=1}^{d} \hat{F}^{-1}_{X_j}\!\left(\frac{i - 0.5}{n}\right) = \sum_{j=1}^{d} x_{(i)j}, \qquad i = 1, \ldots, n. \]

Since

\[ \bar{s}^{\,c} = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{d} x_{(i)j} = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{d} x_{ij} = \bar{s}, \]

Var(S^c) can thus be estimated by

\[ \frac{1}{n-1}\sum_{i=1}^{n} \left( \sum_{j=1}^{d} x_{(i)j} - \bar{s} \right)^{2}. \]

Using the estimates of Var(S) and Var(S^c) we can estimate the herd behavior index as

\[ \text{HIX}_S = \frac{\widehat{\mathrm{Var}}(S)}{\widehat{\mathrm{Var}}(S^c)}. \]

Summarizing, we have the following estimator for ρ^c:

\[ \hat{\rho}^{\,c} = \frac{\displaystyle\sum_{i=1}^{n}\left[ \Big( \sum_{j=1}^{d} x_{ij} - \bar{s} \Big)^{2} - \sum_{j=1}^{d} (x_{ij} - \bar{x}_j)^{2} \right]}{\displaystyle\sum_{i=1}^{n}\left[ \Big( \sum_{j=1}^{d} x_{(i)j} - \bar{s} \Big)^{2} - \sum_{j=1}^{d} (x_{ij} - \bar{x}_j)^{2} \right]}. \]
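In code, the estimator amounts to sorting each marginal sample to obtain the comonotonic counterpart and comparing the variances of the row sums. A minimal sketch (function and variable names are illustrative, not from the thesis):

```python
import numpy as np

def herd_estimates(x):
    """Estimate Var(S), Var(S^c), HIX and rho^c from an (n x d) sample x."""
    x = np.asarray(x, dtype=float)
    s = x.sum(axis=1)                         # s_i: row sums of the sample
    var_S = s.var(ddof=1)                     # estimate of Var(S)
    x_c = np.sort(x, axis=0)                  # comonotonic sample: sort each marginal
    var_Sc = x_c.sum(axis=1).var(ddof=1)      # estimate of Var(S^c)
    var_indep = x.var(axis=0, ddof=1).sum()   # estimate of Var(S^perp)
    hix = var_S / var_Sc
    rho_c = (var_S - var_indep) / (var_Sc - var_indep)
    return var_S, var_Sc, hix, rho_c

# Comonotonic toy data: every column is an increasing transform of the same draw,
# so the estimated HIX and rho^c should both equal one.
rng = np.random.default_rng(0)
u = rng.random(500)
x = np.column_stack([u, 2 * u, u**2])
var_S, var_Sc, hix, rho_c = herd_estimates(x)
```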

6.2

Finding the Market HIX

There is an implied level of co-movement between stocks in the market. A sample of the market is created using adjusted closing prices from a subset of 10 stocks from the S&P 500: AAPL, AMZN, COST, FDX, HPQ, IPG, JNJ, LOW, MCD, MSFT. The stocks were chosen from various sectors, such as information technology, consumer discretionary goods, consumer staple goods, health care, and air freight and logistics. Each stock has a price history beginning before 2005. This reduced market model makes estimation and numerical analysis easier to compute. Results can be extrapolated to a larger scale.

We assume lognormal returns for our stocks. In probability theory, a lognormal distribution is a continuous probability distribution of a random variable whose logarithm is normally distributed. Thus, if the random variable X is log-normally distributed, then Y = ln(X) has a normal distribution.

To measure the variance of a sum of lognormal random variables we use Equation (6.2):

\[ \mathrm{Var}(S) = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j\, S_{i,t=0}\, S_{j,t=0}\, e^{\left(\mu_i + \mu_j + \frac{1}{2}(\sigma_i^2 + \sigma_j^2)\right)T} \left( e^{\rho_{i,j}\sigma_i\sigma_j T} - 1 \right) \tag{6.2} \]

The comonotonic variance Var(S^c) is as above but with ρ_{i,j} = 1 for all i = 1, . . . , n and j = 1, . . . , n. Combining Equation (6.1) with Equation (6.2) gives the market HIX.
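Equation (6.2) and the resulting market HIX can be sketched as follows; the initial prices, drifts, volatilities and correlation matrix below are illustrative assumptions, not the thesis data:

```python
import numpy as np

def lognormal_var(w, S0, mu, sigma, rho, T=1.0):
    """Variance of the weighted sum of lognormal prices, per Equation (6.2)."""
    w, S0, mu, sigma = map(np.asarray, (w, S0, mu, sigma))
    drift = np.add.outer(mu, mu) + 0.5 * np.add.outer(sigma**2, sigma**2)
    cov = (np.outer(w * S0, w * S0) * np.exp(drift * T)
           * (np.exp(rho * np.outer(sigma, sigma) * T) - 1.0))
    return cov.sum()

def market_hix(w, S0, mu, sigma, rho, T=1.0):
    """HIX = Var(S) / Var(S^c); S^c is obtained by setting all rho_ij = 1."""
    rho_c = np.ones_like(rho)
    return (lognormal_var(w, S0, mu, sigma, rho, T)
            / lognormal_var(w, S0, mu, sigma, rho_c, T))

# Illustrative two-stock market (assumed inputs)
w = [0.5, 0.5]
S0 = [100.0, 80.0]
mu = [0.05, 0.07]
sigma = [0.2, 0.3]
rho = np.array([[1.0, 0.4], [0.4, 1.0]])
hix = market_hix(w, S0, mu, sigma, rho)
```

By construction the HIX lies between 0 and 1, and equals 1 exactly when all pairwise correlations are 1.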

6.3

HIX Minimization for N Risky Assets

Similar to our risk minimization in Section 4.4.1, the portfolio selection problem is defined as a minimization of the herd behavior index subject to a return constraint. Two reasons for this convention are:

• a return objective seems intuitively easier to formulate than a risk objective;

• risks are easier to control than returns.



Figure 6.1: The market co-movement parameter for HIX is calibrated daily in the five asset economy, based on the history of the previous thirty days of returns across stocks. The graph displays the implied market HIX from February 15, 2005 to December 31, 2014. In 2011, the HIX reaches its peak of 0.8192; its lowest level of 0.2097 occurs in 2013. The average level of the HIX is 0.4874. The HIX is measured using the last 30 days of stock movement; this is an arbitrary choice, and more data can also be used.

If we adopt this convention, our objective function is the portfolio variance divided by the comonotonic portfolio variance, and we will minimize it with respect to the portfolio weights.

HIX[T] is based on an approximation of the following ratio:

$$\text{ratio} = \frac{\mathrm{Var}[S]}{\mathrm{Var}[S^c]} = \frac{\Sigma_S}{\Sigma_{S^c}}.$$

Again, we will use a little trick and scale it down by a factor of 1/2 to ease our calculations. Since the factor is positive, it does not affect the value of the optimal vector of weights $w^*$:

$$\min_w \tfrac{1}{2}\,\mathrm{HIX} = \tfrac{1}{2}\min_w \frac{\mathrm{Var}[S]}{\mathrm{Var}[S^c]}$$

The constraints are as in Section 4.4.1.

Putting it all together, we formulate the portfolio selection problem as

$$\min_w \tfrac{1}{2}\,\mathrm{HIX} = \min_w \frac{1}{2}\,\frac{w'\Sigma_S w}{w'\Sigma_{S^c} w}$$

subject to:

$$w'\mu \geq m, \qquad w'\mathbf{1} = 1.$$

This problem is an optimization with one inequality and one equality constraint. We can solve it using the method of Lagrange.

We form the Lagrange function with two Lagrange multipliers λ and γ:

$$L(w, \lambda, \gamma) = \frac{1}{2}\,\frac{w'\Sigma_S w}{w'\Sigma_{S^c} w} + \lambda\left(m - w'\mu\right) + \gamma\left(1 - w'\mathbf{1}\right)$$

An analytical solution is difficult to obtain under these constraints, so one may instead solve the optimization problem with a constrained numerical optimization algorithm.
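One possible numerical sketch, using SciPy's SLSQP solver (this is my own illustration with toy inputs, not the thesis's implementation): minimise $\frac{1}{2}\,w'\Sigma_S w / w'\Sigma_{S^c} w$ subject to the target-return and budget constraints, with long-only bounds added.

```python
import numpy as np
from scipy.optimize import minimize

def minimize_hix(Sigma_S, Sigma_Sc, mu, m):
    """Minimise (1/2) * w'Sigma_S w / w'Sigma_Sc w
    s.t. w'mu >= m, w'1 = 1, w >= 0 (long-only)."""
    n = len(mu)
    obj = lambda w: 0.5 * (w @ Sigma_S @ w) / (w @ Sigma_Sc @ w)
    cons = [
        {"type": "ineq", "fun": lambda w: w @ mu - m},    # target return
        {"type": "eq",   "fun": lambda w: w.sum() - 1.0}, # fully invested
    ]
    bounds = [(0.0, 1.0)] * n                             # long-only
    res = minimize(obj, np.full(n, 1.0 / n), method="SLSQP",
                   bounds=bounds, constraints=cons)
    return res.x

# toy two-asset inputs (sigma = 0.2, 0.3; comonotonic case has rho = 1)
Sigma_S  = np.array([[0.04, 0.01], [0.01, 0.09]])
Sigma_Sc = np.array([[0.04, 0.06], [0.06, 0.09]])
mu = np.array([0.06, 0.10])
w_star = minimize_hix(Sigma_S, Sigma_Sc, mu, m=0.07)
print(w_star)
```

SLSQP handles the mixed equality/inequality constraints directly, which matches the Lagrangian formulation above.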


Chapter 7

Results

7.1 The long-only portfolio problem

Using a sample portfolio of ten S&P 500 stocks, it is possible to estimate the market-implied degree of herd behaviour. Figure 7.1 shows the varying degree of co-movement between stocks in an equally weighted market portfolio.

Figure 7.1: Market HIX

The market HIX is based on the stock price movements of the past 30 days and is calculated on a daily basis. For the minimized HIX portfolio, the HIX-minimizing weights are calculated at the end of the entire period and then applied to the portfolio; Figure 7.2 is therefore for demonstrative purposes only. We only have a single set of weights for a single weighting period of ten years. The optimal weights are calculated retrospectively for ten stocks, for long positions only.


The key observation is that the HIX is reduced. The two HIX levels vary above and below each other but in general stay close together. This suggests reducing the length of each period and increasing the total number of weighting periods: by focusing on more recent market information, it may be possible to further reduce the overall level of the HIX. It is important to recall that we assume a frictionless market; in practice, more frequent re-balancing means higher transaction costs will be incurred, which will decrease portfolio value.

Figure 7.2: Market HIX

By extending and refining this HIX-minimizing mechanism we can apply the method to a realistic scenario. We will analyse the results of adding an arbitrary target return and "long-only" weighting constraints to the minimization function, and of increasing the number of re-weighting periods.

After each 30-day period, the portfolio is re-weighted so as to minimize the HIX over the past 30 days, and these weights are then applied to the next period. The advantage of this is that new information is consistently added to the model while old information is discarded. The model could potentially be improved by using weighting periods longer than 30 days; exponential weighting of past observations could also be useful to further improve the precision of the results.
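The rolling re-weighting procedure can be sketched as follows. This is a simplified scaffold of my own (the `optimizer` argument is a placeholder for any weight-finding routine, such as the HIX minimisation above; the equal-weight default and the fake return data are purely illustrative).

```python
import numpy as np

def rolling_weights(returns, window=30, optimizer=None):
    """returns: (T, n) array of daily returns.
    Every `window` days, fit weights on the previous `window` days of data
    and schedule them for the next holding period."""
    T, n = returns.shape
    if optimizer is None:
        optimizer = lambda past: np.full(n, 1.0 / n)  # placeholder: equal weights
    schedule = []
    for start in range(window, T, window):
        past = returns[start - window:start]          # the last 30 days only
        schedule.append((start, optimizer(past)))     # hold over the next window
    return schedule

rng = np.random.default_rng(1)
rets = rng.normal(0.0005, 0.01, size=(120, 10))  # fake daily returns, 10 stocks
plan = rolling_weights(rets)
print(len(plan))  # 3 re-weighting dates: days 30, 60, 90
```

Old observations drop out of the estimation window automatically, which is exactly the "new information in, old information out" behaviour described above.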

Figure 7.3 shows that the average HIX is successfully reduced subject to the constraints. Multiple peaks are observed in the minimized HIX; in these situations the minimization fails to meet the constraints. Specifically, the return constraint is not always satisfied due to poor stock returns in the previous 30 days, and the best feasible solution is then to invest the entirety of wealth in the stock with the best return. When all wealth is invested in one stock we encounter perfect herd behaviour (i.e. HIX = 1), so when our
