
Seasonality in the Predictability of Government Bond Excess Returns


Academic year: 2021



Supervisor: Prof. P. A. Bekker

Co-assessor: Prof. L. Spierdijk

Author: Renxuan Wang (Student No. s1621319)


To my parents, Yan


Abstract

I use various models to make in-sample predictions of the one-year U.S. government bond excess returns. The results suggest that government bond excess returns are predictable, where predictability is measured in this thesis by the $R^2$s of the in-sample prediction regressions.


Preface

This thesis is the result of my graduation project in order to obtain the master’s degree in Econometrics at the University of Groningen. I would like to thank the following people for their help.

First of all, I would like to express my gratitude to Prof. Bekker, not only for his supervision of this thesis, but also for his guidance and kind help throughout these years. I very much enjoyed our discussions and meetings about econometrics and pretty much everything else. Through our discussions, I learned to be more critical, rigorous and optimistic towards research and life. I would also like to thank my second supervisor Prof. Spierdijk for her efforts.

Second, I would like to thank all the people who have helped me during my stay in Groningen. As a foreigner, my days abroad would have been much more difficult without their help. In particular, many thanks go to Prof. Van der Vlerk, who helped me greatly during the ID problem in my first year. Moreover, I thank Mr. Knypstra and dr. Schoonbeek for their kind help during my application and my study. I would also like to thank the board members of the ACSSG and Vereniging Nederland China, and particularly dr. Ning Ding and Mr. Kolkman, for their encouragement. In addition, I am grateful to Nuffic for the generous Huygens Scholarship.

Contents

Preface

1. Introduction
1.1. Research Questions and Introduction
1.2. Literature Review Related to Forecasting Bond Prices

2. Basic Concepts and Notations
2.1. Bond Prices, Yields and Forward Rates
2.2. Bond Excess Returns

3. A Review of Yield Models
3.1. The Nelson-Siegel Model and Its Arbitrage-free Extension
3.1.1. The Nelson-Siegel Model
3.1.2. The Arbitrage-Free Nelson-Siegel Model
3.2. The Svensson Model
3.3. The Legendre Polynomial Model
3.4. Fitting performances
3.4.1. The Data
3.4.2. In-Sample Fitting Results

4. Predicting Bond Excess Returns
4.1. The Cochrane and Piazessi model
4.2. Bond Excess Return Prediction

5. Seasonality in the Predictability of Excess Returns
5.1. Predicting one-year Bond Excess Returns: A SUR Framework
5.1.2. Feasible Generalized Least Squares Estimation
5.2. The Time-varying Predictability in Bond Excess Returns
5.3. Testing Seasonality Based on the SUR model
5.3.1. Restricted Covariance Matrix
5.3.2. Maximum-likelihood Estimation
5.3.3. A Likelihood Ratio Test

6. Conclusion

A. Derivation of AFNS Model

B. The Kalman Filter

C. Detailed Results Bond Excess Return Regressions


Chapter 1.

Introduction

1.1 Research Questions and Introduction

Government bonds are not only important fixed-income instruments that take up a large proportion of investment portfolios but also key economic indicators that are closely watched by economists and central bankers around the world. Understanding the dynamics of bond returns is therefore extremely important. One of the first questions researchers would ask is

“Can we predict the bond prices at all?”

Many existing papers have investigated this question. Among them, some have found evidence that bond returns are predictable. In particular, in a recent paper, Cochrane and Piazessi (2005) showed that using one- to five-year forward rates to predict the one-year bond excess return yields an $R^2$ as high as 35%. Furthermore, many papers have focused on modeling the term structure of interest rates directly, and found impressive out-of-sample forecasting performance for their models. Starting from Litterman and Scheinkman (1991), one of the most popular approaches is to extract a few factors that are responsible for most of the variation in yield curve movements. These few factors are often interpreted as the level, slope and curvature of the yield curve. Chapter 3 reviews the most popular models in this class. Subsequently, following the approach used in Cochrane and Piazessi (2005), I also use the level, slope and curvature to predict the one-year U.S. government bond excess returns. I show that these level, slope and curvature type models display in-sample forecasting performance similar to that of the Cochrane and Piazessi model in terms of $R^2$ [4]. The results provide evidence that there is predictability in bond excess returns.

Many existing papers (e.g. Keim and Stambaugh (1986) and Bondt and Thaler (1985)) have investigated seasonal effects in the returns of stocks and low-grade corporate bonds, and found evidence of such effects. It is also of great interest to carry out such an investigation for the government bond market, yet few papers have done so. Among these papers, Smirlock (1985) and Smith (2002), among others, find no sign of seasonal effects in the high-grade bond market. In this thesis, instead of trying to find seasonal effects in the returns of U.S. government bonds, I investigate whether there are seasonal effects in the predictability of bond excess returns, measured by $R^2$s. More explicitly, I ask the following question:

“Is the predictability time-varying? Or more precisely, are there seasonal effects in the predictability?”

My findings answer the question directly and show that the predictability in bond excess returns is not only time-varying, but also displays a seasonal effect at the end of each year, culminating in January. This result seems to be novel. To my understanding, no existing paper has found a similar one.

Furthermore, I use a new approach to study the seasonal effect. Past papers mainly add monthly dummy variables, such as a January dummy (for example, Smirlock (1985) and Keim and Stambaugh (1986)), to test for seasonal effects. In order to investigate the seasonal effect in the predictability of bond excess returns, I base my test on a Seemingly Unrelated Regression (SUR) model. This model seems to be more appropriate than the one used by Cochrane and Piazessi (2005). Furthermore, by restricting the variance-covariance matrix in the SUR model, I can estimate the model by maximum likelihood. Likelihood ratio tests based on the maximum-likelihood estimates are then performed in order to verify the seasonal effects. I find that the seasonal effect in the predictability of bond excess returns is statistically significant.

[4] The level, slope and curvature of the yield curve produce $R^2$s as high as 31%, despite they often have

This thesis is organized as follows. The rest of this chapter reviews the literature on forecasting bond prices and on seasonal effects. Chapter 2 introduces the basic notation and concepts related to bond markets and the term structure of interest rates. Chapter 3 reviews the yield models that are used to forecast the bond excess returns, and Chapter 4 uses these models to predict bond excess returns and reports their forecasting performance. The seasonality of the predictability is treated in Chapter 5. In that chapter, I first introduce the SUR model that I use to investigate the seasonal effects. Subsequently, the estimation procedure and the tests for the existence of the effects are presented. Chapter 6 concludes this thesis.

1.2 Literature Review Related to Forecasting Bond Prices

Much research has been done to understand the evolution of bond prices and the relationship between long-term and short-term bonds. A predominant theory is the theoretically appealing Expectation Hypothesis (Fama (1984), Stambaugh (1988)), which asserts that long-term bond yields are averages of the expected future short yields. However, many papers have tested the hypothesis in bond markets and found evidence that rejects it. Shiller (1979) showed that the volatility of long-term interest rates actually exceeds the limits implied by the Expectation Hypothesis. Based on this excess volatility, or risk premium, in long-term bonds, he concluded that “there is a kind of predictability in long-term bonds”. This “risk premium” was further examined by subsequent research papers. Fama and Bliss (1987) modeled it by using one-year excess returns. The one-year excess return on an $n$-year ($n \geq 2$) bond, say, is obtained as follows. First, borrow at the one-year rate and buy an $n$-year bond. Hold the bond for a year and sell it as an $(n-1)$-year bond. The one-year excess return on the $n$-year bond is the return after paying off the one-year loan. This term captures the risk premium of holding a long-term bond rather than a short-term bond. In their paper, Fama and Bliss used the spread between the $n$-year forward rate and the one-year yield to predict the one-year excess return on an $n$-year bond, and found an $R^2$ of 18%.

In a more recent paper, Cochrane and Piazessi (2005) extended the work of Fama and Bliss. They used the one- to five-year forward rates, along with a constant, to predict the one-year excess returns on two- to five-year bonds, and found an $R^2$ of 35%.

Other than trying to forecast the excess returns, much effort has been made to model the yield curves directly. Many models are built on assumptions of no arbitrage or market equilibrium. Representative theoretical term structure models are no-arbitrage models [6] and equilibrium models [7]. However, though theoretically rigorous, these models often exhibit poor empirical performance, as shown in Duffee (2002). Moreover, the estimation procedures for these models are often problematic and time-consuming (Kim and Orphanides (2005)). Instead of building term structure models on market assumptions or economic theory, many researchers simply looked into the empirical properties of the yield curve movements and observed that several common “factors” are responsible for most of the variation in bond returns (Bliss (1997)). These factors are interpreted as the “level”, “slope” and “curvature” of the yield curve (Litterman and Scheinkman (1991)). Representative factor models include the principal component model (Litterman and Scheinkman (1991)), the Legendre polynomial model (Almeida et al. (1998)) and the Nelson-Siegel class of models [8]. Though the factor models lack the theoretical appeal of the no-arbitrage models, these models, particularly the Nelson-Siegel class of models, perform well in forecasting bond yields and are favored by practitioners [9]. Among the Nelson-Siegel class of models, the most popular are the Nelson-Siegel model and the Svensson model. Diebold and Li (2006) use the Nelson-Siegel model to forecast U.S. government bond yields at a point in time. The Svensson model is similar to the Nelson-Siegel model, except for an additional factor. As shown in Diebold and Li (2006), the three factors in the Nelson-Siegel model can be interpreted as the level, slope and curvature of the yield curve. Furthermore, Christensen et al. (2009) derived an arbitrage-free Nelson-Siegel model, which makes the Nelson-Siegel model arbitrage-free.

[6] Prominent contributions in the no-arbitrage vein include Heath et al. (1992) and Hull and White (1990).

[7] Prominent contributions in the affine equilibrium tradition include Vasicek (1977), Cox et al. (1985) and Duffie and Kan (1996).

[8] A good summary is written by de Pooter (2007).

[9] According to BIS (2005), nine out of thirteen central banks use either of the two models or a combination of the two.

Chapter 2.

Basic Concepts and Notations

In this chapter, I will introduce the key terms and concepts in bond markets. In the process, notations will be established.

2.1 Bond Prices, Yields and Forward Rates

Bond prices, yields and forward rates are the most commonly used terms in bond markets. Furthermore, they are also the most important theoretical concepts in modeling the term structure of interest rates. In this section, I will briefly introduce these three concepts and discuss the relationship between them. A more detailed discussion can be found, for example, in Fabozzi (1995).

Another very important concept is the time to maturity. Suppose we are at time $t$, and denote a time instance in the future, $T$, as the maturity date, so $t \leq T$. The time to maturity $\tau$ is defined as $T - t$.

Let $P_t(\tau)$ denote the price, or present value, at time $t$ of a zero-coupon bond with a maturity of $\tau$ months. The yield to maturity, or simply the yield, of the zero-coupon bond at time $t$ is the continuously compounded discount rate at which the present value of the \$1 payment at time $T$ equals $P_t(\tau)$. Letting $y_t(\tau)$ denote the yield, we have

$$P_t(\tau) = \exp(-\tau y_t(\tau)). \tag{2.1}$$

Suppose an investor decides at time $t$ that he wants to buy, at a future time $T$ ($T = t + \tau$), a zero-coupon bond that matures at time $T + s$, $s > 0$. To do so he needs a forward contract with the issuer. In order to fix the price of the zero-coupon bond at time $T$, the forward contract essentially has to specify the interest rate that applies from time $T$ onwards. Based on the yields at $t$, the investor considers his two options:

1. Wait till $T + s$ to get a return of $\exp((\tau + s) y_t(\tau + s))$;

2. At time $T$, get a return of $\exp(\tau y_t(\tau))$, and reinvest it at time $T$ at an interest rate of $y_T(s)$.

Let $f_t(T, T + s)$ denote the forward rate at time $t$ for a forward contract that starts at $T$ and ends at $T + s$. The forward rate is fixed based on the assumption that the two options above yield the same total return, i.e.

$$e^{\tau y_t(\tau)} e^{s f_t(T, T+s)} = e^{(\tau + s) y_t(\tau + s)},$$

from which we can solve for the forward rate:

$$f_t(T, T+s) = \frac{1}{s}\left[(\tau + s) y_t(\tau + s) - \tau y_t(\tau)\right] \tag{2.2}$$

$$\phantom{f_t(T, T+s)} = \frac{1}{s} \ln\left(\frac{P_t(\tau)}{P_t(\tau + s)}\right). \tag{2.3}$$

In addition, if we let $s \to 0$, we obtain the spot forward rate at time $T$, which we denote as $f_t(T)$. In the continuous time framework, the yield $y_t(\tau)$ could be defined as an

Equations (2.2) and (2.3) summarize the relationships among the forward rates, the yields and the bond prices. Without any further assumptions, it is apparent that the forward rate is no more than a function of yields and bond prices. Therefore, when forecasting the bond excess returns, using information contained in forward rates is actually equivalent to using information in yields or bond prices.
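The relationships in (2.1) to (2.3) can be checked numerically. The following is a minimal sketch; the function names and the two yields are hypothetical illustration values, not data from this thesis. It computes the forward rate once from yields and once from prices and confirms that both routes agree.

```python
import numpy as np

def price(tau, y):
    """Zero-coupon price from (2.1): P_t(tau) = exp(-tau * y_t(tau))."""
    return np.exp(-tau * y)

def forward_from_yields(tau, s, y_tau, y_tau_s):
    """Forward rate f_t(T, T+s) from yields, as in (2.2)."""
    return ((tau + s) * y_tau_s - tau * y_tau) / s

def forward_from_prices(s, p_tau, p_tau_s):
    """Forward rate f_t(T, T+s) from bond prices, as in (2.3)."""
    return np.log(p_tau / p_tau_s) / s

# Hypothetical per-month, continuously compounded yields (assumption).
tau, s = 24.0, 12.0
y_tau, y_tau_s = 0.0040, 0.0045

f_y = forward_from_yields(tau, s, y_tau, y_tau_s)
f_p = forward_from_prices(s, price(tau, y_tau), price(tau + s, y_tau_s))
print(f_y, f_p)
```

Because the forward rate is an exact function of the two yields, any regression on forward rates spans the same information as a regression on the corresponding yields or log prices.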

2.2 Bond Excess Returns

In this thesis, the one-year U.S. government bond excess returns are the variables I am going to forecast using various models. In this section, I will discuss these excess returns in more detail.

Following the definition in Cochrane and Piazessi (2005), the bond excess returns are log one-year bond excess returns. Let us first define the one-year log holding period return on a $\tau$-month bond, obtained by buying a $\tau$-month bond at time $t$ and selling it as a $(\tau - 12)$-month bond 12 months later, i.e.

$$r_{t+12}(\tau) = \ln P_{t+12}(\tau - 12) - \ln P_t(\tau). \tag{2.5}$$

The one-year excess return on the $\tau$-month bond at time $t + 12$ is the holding period return in excess of the return on a one-year bond bought at time $t$, which equals $-\ln P_t(12)$ (the one-year log bond price $\ln P_t(12)$ is smaller than zero),

$$R_{t+12}(\tau) = r_{t+12}(\tau) + \ln P_t(12). \tag{2.6}$$

As we can see from (2.6), the one-year excess return measures the term premium of buying a long-term bond rather than a short-term bond. Furthermore, if we write out $R_{t+12}(\tau)$ in terms of the bond prices, we have

$$R_{t+12}(\tau) = \ln \frac{P_{t+12}(\tau - 12)}{P_t(\tau)} + \ln P_t(12) = \ln \frac{P_t(12)\, P_{t+12}(\tau - 12)}{P_t(\tau)}. \tag{2.7}$$
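As an illustration of (2.5) and (2.6), the sketch below computes one-year excess returns from monthly series of log prices. The function name, the argument layout (one array per maturity, indexed by month) and the numbers are assumptions made for this example only.

```python
import numpy as np

def excess_returns(log_p_tau, log_p_tau_m12, log_p_12, h=12):
    """One-year log excess returns R_{t+12}(tau), following (2.5)-(2.6).

    log_p_tau[t]     = ln P_t(tau)
    log_p_tau_m12[t] = ln P_t(tau - 12)
    log_p_12[t]      = ln P_t(12)
    Returns an array of length T - h whose entry t is R_{t+12}(tau).
    """
    log_p_tau = np.asarray(log_p_tau, dtype=float)
    log_p_tau_m12 = np.asarray(log_p_tau_m12, dtype=float)
    log_p_12 = np.asarray(log_p_12, dtype=float)
    # (2.5): sell at t+12 as a (tau-12)-month bond, bought at t as a tau-month bond
    r = log_p_tau_m12[h:] - log_p_tau[:-h]
    # (2.6): add ln P_t(12), i.e. subtract the one-year return
    return r + log_p_12[:-h]

# Hypothetical flat and constant curve: 0.4% per month at every maturity.
T, c = 13, 0.004
flat = excess_returns(np.full(T, -24 * c), np.full(T, -12 * c), np.full(T, -12 * c))

# Hypothetical upward-sloping curve: 24-month yield 0.5%, 12-month yield 0.4%.
sloped = excess_returns(np.full(T, -24 * 0.005), np.full(T, -12 * c), np.full(T, -12 * c))
print(flat, sloped)
```

With a flat and constant curve the excess return is zero; in the second case the holder of the 24-month bond earns the extra yield as the bond rolls down to 12 months.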

The Expectation Hypothesis, mentioned in Chapter 1, asserts that the long yield $y_t(\tau + s)$ is a weighted average of the expected future short yields. For example, this hypothesis

Figure 2.1.: One-year excess returns on 2-, 3- and 5-year bonds

Notes: The one-year excess returns on 2-, 3- and 5-year zero-coupon bonds. The zero-coupon bond prices are computed based on the CRSP zero-coupon yields from July 1953 to December 2000.

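From (2.1), this consequently implies that the $(\tau + s)$-month zero-coupon bond price at time $t$ should be the compounded price of a $\tau$-month bond at time $t$ and an $s$-month bond at time $T$:

$$P_t(\tau + s) = P_t(\tau) P_T(s). \tag{2.8}$$

Therefore, if the Expectation Hypothesis were true, $R_{t+12}(\tau)$ defined in (2.7) would always be zero: applying (2.8) with a 12-month bond at time $t$ and a $(\tau - 12)$-month bond at time $t + 12$ gives $P_t(\tau) = P_t(12) P_{t+12}(\tau - 12)$, so the ratio inside the logarithm in (2.7) equals one. In that case there would be nothing to predict.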

Chapter 3.

A Review of Yield Models

In this chapter, I will briefly describe two classes of models, namely, the Nelson-Siegel class of models and the Legendre polynomial model. These two classes of models are often used to model the yield curve. In this thesis, I will use them to make in-sample predictions for bond excess returns. For each model, not only the functional form but also the estimation method will be described. At the end of this chapter, their in-sample fitting performances will also be compared.

3.1 The Nelson-Siegel Model and Its Arbitrage-free Extension

3.1.1. The Nelson-Siegel Model

Based on a popular mathematical approximating function, namely the Laguerre function, Nelson and Siegel (1987) model the spot forward rate as follows:

$$f_t(T) = \beta_1^{NS} + \beta_2^{NS} e^{-\lambda\tau} + \beta_3^{NS} \lambda\tau e^{-\lambda\tau} + \varepsilon_t(\tau).$$

In a more recent paper, Diebold and Li (2006) considered the transformed version of the original Nelson-Siegel (NS) model to model the yields:

$$y_t(\tau) = \beta_{1t}^{NS} + \beta_{2t}^{NS}\left(\frac{1 - e^{-\lambda\tau}}{\lambda\tau}\right) + \beta_{3t}^{NS}\left(\frac{1 - e^{-\lambda\tau}}{\lambda\tau} - e^{-\lambda\tau}\right) + \varepsilon_t(\tau). \tag{3.1}$$

The added subscript $t$ in the factors indicates that Diebold and Li (2006) allow $\beta_{1t}^{NS}$, $\beta_{2t}^{NS}$ and $\beta_{3t}^{NS}$ to follow AR(1) processes. Therefore, Diebold and Li (2006) name the model the “Dynamic Nelson-Siegel” (DNS) model.

If $\lambda$ is fixed beforehand, the components $\left(1,\ \frac{1 - e^{-\lambda\tau}}{\lambda\tau},\ \frac{1 - e^{-\lambda\tau}}{\lambda\tau} - e^{-\lambda\tau}\right)$ can simply be viewed as regressors. They are called “factor loadings” by Diebold and Li (2006). The second and the third factor loadings have the following limits with respect to $\tau$:

$$\lim_{\tau \to 0} \frac{1 - e^{-\lambda\tau}}{\lambda\tau} = 1, \qquad \lim_{\tau \to \infty} \frac{1 - e^{-\lambda\tau}}{\lambda\tau} = 0; \tag{3.2}$$

$$\lim_{\tau \to 0} \left(\frac{1 - e^{-\lambda\tau}}{\lambda\tau} - e^{-\lambda\tau}\right) = 0, \qquad \lim_{\tau \to \infty} \left(\frac{1 - e^{-\lambda\tau}}{\lambda\tau} - e^{-\lambda\tau}\right) = 0. \tag{3.3}$$

From (3.2) and (3.3), we find the short rate and the long rate of the DNS model:

$$\lim_{\tau \to 0} y_t(\tau) = \beta_{1t} + \beta_{2t} \quad \text{and} \quad \lim_{\tau \to \infty} y_t(\tau) = \beta_{1t}.$$

Furthermore, the different structures of the factor loadings also determine their roles in capturing yields of different maturities. Figure 3.1 shows the shape of the factor loadings for different $\tau$. From this figure, we observe that when $\tau$ is very small, the second loading $\frac{1 - e^{-\lambda\tau}}{\lambda\tau}$ is at its peak and therefore makes most of its contribution at the short end. At the long end of the yield curve, when $\tau$ is very large, only the first factor loading does not converge to zero, so it plays the role of capturing the “level” of the yield curve. Finally, the third loading is hump-shaped and peaks at medium maturities. This feature determines its prominent role in capturing the shape of the medium-term yields. However, we should notice that the maturity at which this loading peaks depends on the value of $\lambda$. As shown in Figure 3.1, the smaller the value of $\lambda$, the longer the maturity at which the third loading peaks.
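The behaviour of the three loadings can be verified numerically. The sketch below is only an illustration: the function name is hypothetical, and the value λ = 0.0609 is the one fixed by Diebold and Li (2006), used here as an assumption rather than a value taken from this thesis.

```python
import numpy as np

def ns_loadings(tau, lam):
    """Nelson-Siegel factor loadings (level, slope, curvature) for maturities tau."""
    x = lam * np.asarray(tau, dtype=float)
    slope = -np.expm1(-x) / x          # (1 - exp(-x)) / x, stable for small x
    curvature = slope - np.exp(-x)
    return np.column_stack([np.ones_like(x), slope, curvature])

lam = 0.0609  # assumption: value fixed by Diebold and Li (2006), tau in months

short_end = ns_loadings([1e-6], lam)[0]   # tau -> 0, compare with (3.2)-(3.3)
long_end = ns_loadings([1e6], lam)[0]     # tau -> infinity

grid = np.arange(1, 121)                  # maturities of 1 to 120 months
peak_tau = grid[np.argmax(ns_loadings(grid, lam)[:, 2])]
peak_tau_small_lam = grid[np.argmax(ns_loadings(grid, lam / 2)[:, 2])]

print(short_end, long_end, peak_tau, peak_tau_small_lam)
```

Halving λ roughly doubles the maturity at which the curvature loading peaks, which matches the remark about Figure 3.1.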

Given a fixed $\lambda$, we estimate (3.1) by Ordinary Least Squares (OLS). There are other, more complicated ways to estimate the model, such as non-linear least squares and maximum-likelihood estimation. However, as shown in Bolder and Liu (2007), OLS estimation is a superior approach to estimating the Nelson-Siegel model compared to the others.
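A minimal sketch of the OLS step described above: with λ fixed, the factors at each date are the coefficients of a cross-sectional regression of that date's yields on the three loadings. The function names, the value of λ and the synthetic factors are assumptions for illustration; the maturity grid is the one listed in Section 3.4.1.

```python
import numpy as np

def ns_loadings(tau, lam):
    """Nelson-Siegel factor loadings for maturities tau (months)."""
    x = lam * np.asarray(tau, dtype=float)
    slope = -np.expm1(-x) / x
    return np.column_stack([np.ones_like(x), slope, slope - np.exp(-x)])

def fit_ns_ols(yields, tau, lam):
    """Date-by-date OLS of yields (T x n) on the loadings; returns T x 3 factors."""
    X = ns_loadings(tau, lam)
    betas, *_ = np.linalg.lstsq(X, np.asarray(yields, dtype=float).T, rcond=None)
    return betas.T

tau = np.array([24, 30, 36, 48, 60, 72, 84, 96, 108, 120], dtype=float)
lam = 0.0609  # assumption: illustrative fixed value

# Hypothetical factors for two dates, used to generate noiseless yields.
true_betas = np.array([[0.070, -0.020, 0.010],
                       [0.065, -0.010, -0.005]])
yields = true_betas @ ns_loadings(tau, lam).T

est_betas = fit_ns_ols(yields, tau, lam)
print(est_betas)
```

Because the loadings do not depend on the date, a single least-squares call handles all dates at once.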

3.1.2. The Arbitrage-Free Nelson-Siegel Model

Christensen et al. (2009) extended the DNS model to make it arbitrage-free. This “arbitrage-free Nelson-Siegel model” (AFNS) is in fact a subclass of the standard affine continuous-time arbitrage-free models as summarized in Duffie and Kan (1996) [4]. This subclass has a functional form very similar to that of the DNS model. As shown in Christensen et al. (2009), this model takes the following form:

$$y_t(\tau) = \beta_{1t}^{AFNS} + \beta_{2t}^{AFNS}\left(\frac{1 - e^{-\lambda\tau}}{\lambda\tau}\right) + \beta_{3t}^{AFNS}\left(\frac{1 - e^{-\lambda\tau}}{\lambda\tau} - e^{-\lambda\tau}\right) - \frac{C(t, T)}{\tau} + \varepsilon_t(\tau), \tag{3.4}$$

where the analytical form of $\frac{C(t,T)}{\tau}$ is provided in (A.11). This additional term is the only difference between the functional forms of the DNS and AFNS models. Furthermore, this term has very limited influence when the AFNS model is applied to fit the yield curve. Consequently, the AFNS model preserves most of the empirical properties of the DNS model while being more theoretically rigorous. Due to these virtues of the AFNS model, I will use the estimated factors of the AFNS model to forecast the excess returns.

Figure 3.1.: Factor loadings of Nelson-Siegel Model

[4] More detailed background and derivation of the AFNS model, as well as the standard affine continuous-time arbitrage-free models, are discussed in Appendix A.

Due to the presence of $\frac{C(t,T)}{\tau}$, the estimation procedure for the AFNS factors $\beta_{1t}^{AFNS}$, $\beta_{2t}^{AFNS}$ and $\beta_{3t}^{AFNS}$ is a bit more complicated than that of the DNS model. In this thesis, I use the Kalman filter [6] to estimate them. In order to do so, it is convenient to first write the AFNS model in a state-space representation [7], which consists of a state equation and a measurement equation.

First, let $Z_t = (\beta_{1t}^{AFNS}, \beta_{2t}^{AFNS}, \beta_{3t}^{AFNS})'$ denote the latent variables. The state equation specifies the dynamics of the latent variables and is given by

$$Z_t = (I_3 - A)\mu + A Z_{t-1} + \eta_t, \qquad \eta_t \sim N(0, Q), \tag{3.5}$$

where

$$A = \begin{pmatrix} a_{11} & 0 & 0 \\ 0 & a_{22} & 0 \\ 0 & 0 & a_{33} \end{pmatrix}$$

is the transition matrix and $\mu = (\mu_1, \mu_2, \mu_3)'$ is the unconditional mean of $Z_t$. Notice that we assume the latent variables to be independent, so the covariance matrix $Q$ is a $3 \times 3$ diagonal matrix. This assumption simplifies the estimation procedure by reducing the number of parameters and thus improves the accuracy of the estimation. Furthermore, as pointed out by Diebold et al. (2006) and Christensen et al. (2009), allowing the factors to be correlated does not improve the out-of-sample prediction performance of the model by much.

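Second, the measurement equation is given by

$$y_t = H Z_t - C + \varepsilon_t, \qquad \varepsilon_t \sim N(0, R), \tag{3.6}$$

where $R$ is a diagonal matrix with $j$th diagonal element, $j = 1, \dots, n$, equal to $\sigma_\varepsilon^2(\tau_j)$,

$$y_t = \begin{pmatrix} y_t(\tau_1) \\ \vdots \\ y_t(\tau_n) \end{pmatrix}, \quad H = \begin{pmatrix} 1 & \frac{1 - e^{-\lambda\tau_1}}{\lambda\tau_1} & \frac{1 - e^{-\lambda\tau_1}}{\lambda\tau_1} - e^{-\lambda\tau_1} \\ \vdots & \vdots & \vdots \\ 1 & \frac{1 - e^{-\lambda\tau_n}}{\lambda\tau_n} & \frac{1 - e^{-\lambda\tau_n}}{\lambda\tau_n} - e^{-\lambda\tau_n} \end{pmatrix} \quad \text{and} \quad \varepsilon_t = \begin{pmatrix} \varepsilon_{1t} \\ \vdots \\ \varepsilon_{nt} \end{pmatrix}.$$

In particular, $C = \left(\frac{C(\tau_1)}{\tau_1}, \dots, \frac{C(\tau_n)}{\tau_n}\right)'$ is a vector of the yield adjustment terms, whose analytical form is shown in (A.12).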

The parameters in $A$, $\mu$ and $Q$ in (3.5), and $\lambda$ in (3.6) constitute the parameter set [8], $\psi$, say. Let $\hat\varepsilon_t$ denote the measurement residuals defined as

$$\hat\varepsilon_t = y_t - H \hat Z_{t|t-1} + C,$$

where $\hat Z_{t|t-1}$ is the a priori estimate of $Z_t$ when using the Kalman filter [9]. Let $S_t$ be the covariance matrix of $\hat\varepsilon_t$. We can then estimate the parameters by maximizing the log-likelihood function

$$L(\psi) = -\frac{nT}{2} \log 2\pi - \frac{1}{2} \sum_{t=1}^{T} \log |S_t| - \frac{1}{2} \sum_{t=1}^{T} \hat\varepsilon_t' S_t^{-1} \hat\varepsilon_t. \tag{3.7}$$

In order to calculate the standard errors of the parameter estimates, I used the “BHHH” estimator to estimate the variance of the estimated parameters. This estimator is of the form

$$\widehat{Var}(\hat\psi) = \left[\sum_{t=1}^{T} \frac{\partial L_t(\hat\psi)}{\partial \hat\psi} \frac{\partial L_t(\hat\psi)}{\partial \hat\psi}'\right]^{-1},$$

where the $\frac{\partial L_t(\hat\psi)}{\partial \hat\psi}$ are computed using the numerical gradient of each contribution $L_t$ to the log-likelihood function.

[6] A brief introduction to the Kalman filter is provided in Appendix B.

[7] For more explanations of the state-space representation and the latent factor approach, see Diebold et al. (2006).
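The prediction-error decomposition in (3.7) can be sketched as follows. This is not the code used for the thesis: the function names are hypothetical, the filter is started at the unconditional mean and variance of the state (which assumes $|a_{ii}| < 1$), and the yield adjustment vector `C` is passed in as a given array because its analytical form is derived in Appendix A. The sign of `C` follows the residual definition $\hat\varepsilon_t = y_t - H\hat Z_{t|t-1} + C$.

```python
import numpy as np

def ns_loadings(tau, lam):
    """Loading matrix H of the measurement equation for maturities tau."""
    x = lam * np.asarray(tau, dtype=float)
    slope = -np.expm1(-x) / x
    return np.column_stack([np.ones_like(x), slope, slope - np.exp(-x)])

def kalman_loglik(y, H, C, A, mu, Q, R):
    """Gaussian log-likelihood (3.7) for state equation (3.5) and residuals y - H z + C."""
    T, n = y.shape
    k = len(mu)
    eye = np.eye(k)
    z = mu.copy()                                      # assumption: start at unconditional mean
    P = np.diag(np.diag(Q) / (1.0 - np.diag(A) ** 2))  # unconditional variance, diagonal A and Q
    loglik = -0.5 * n * T * np.log(2.0 * np.pi)
    for t in range(T):
        # prediction step: a priori state estimate and its covariance
        z_pred = (eye - A) @ mu + A @ z
        P_pred = A @ P @ A.T + Q
        # measurement residual and its covariance S_t
        eps = y[t] - H @ z_pred + C
        S = H @ P_pred @ H.T + R
        loglik -= 0.5 * np.linalg.slogdet(S)[1]
        loglik -= 0.5 * eps @ np.linalg.solve(S, eps)
        # update step: a posteriori state estimate and its covariance
        K = P_pred @ H.T @ np.linalg.inv(S)
        z = z_pred + K @ eps
        P = (eye - K @ H) @ P_pred
    return loglik

# Hypothetical parameter values, for illustration only.
tau = np.array([24.0, 60.0, 120.0])
H = ns_loadings(tau, 0.06)
A = np.diag([0.9, 0.8, 0.7])
mu = np.array([0.06, -0.02, 0.01])
Q = np.diag([1e-4, 2e-4, 3e-4])
R = 1e-5 * np.eye(3)
C = np.zeros(3)
y = np.vstack([H @ mu + 0.001, H @ mu - 0.002])

ll_two_dates = kalman_loglik(y, H, C, A, mu, Q, R)
print(ll_two_dates)
```

In practice this function would be handed to a numerical optimizer over $\psi$, and the per-date terms of the sum would be differentiated numerically to form the BHHH estimator.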

The estimated factors are plotted in Figure 3.2. In Diebold and Li (2006), the three factors are interpreted as the “level”, “slope” and the “curvature” of the yield curves.

3.2 The Svensson Model

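Svensson (1994) extended the Nelson-Siegel model to make it more flexible. Like the DNS model, the extended model is also popular among central bankers. Compared to the Nelson-Siegel model in (3.1), it adds a fourth component with an extra parameter $\lambda_2$, and is given by

$$y_t(\tau) = \beta_{1t}^{SV} + \beta_{2t}^{SV}\left(\frac{1 - e^{-\lambda_1\tau}}{\lambda_1\tau}\right) + \beta_{3t}^{SV}\left(\frac{1 - e^{-\lambda_1\tau}}{\lambda_1\tau} - e^{-\lambda_1\tau}\right) + \beta_{4t}^{SV}\left(\frac{1 - e^{-\lambda_2\tau}}{\lambda_2\tau} - e^{-\lambda_2\tau}\right) + \varepsilon_t(\tau). \tag{3.8}$$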

The fourth factor loading of the model, $\frac{1 - e^{-\lambda_2\tau}}{\lambda_2\tau} - e^{-\lambda_2\tau}$, which is of the same form as the third factor loading, enables the model to fit the yield curve better for medium-term yields. As shown in Figure 3.1, adding a factor loading with a different $\lambda$ enables the Svensson model to be more flexible. Consequently, compared to the NS model, it performs better in capturing yield curves of more complicated shapes, especially yield curves with two local maxima or minima.

[8] ... matrix as a fixed scalar (0.00035) multiple of an identity matrix. The reason is that estimating $R$ would dramatically increase the number of parameters to estimate, and thus make the estimation procedure unstable and inefficient. In addition, Diebold et al. (2006) did not provide any information about the starting value of $R$ on the estimation results.

[9] For more details about this estimate, see (B.3).

Similar to the DNS model, if we fix $\lambda_1$ and $\lambda_2$ first, $\beta_{it}^{SV}$, $i = 1, \dots, 4$, can be estimated simply by OLS. However, as the third and the fourth factor loadings are of exactly the same form, multicollinearity problems can easily arise [12] if we do not specify $\lambda_1$ and $\lambda_2$ properly. In order to avoid this problem, I considered the adjusted Svensson model proposed by de Pooter (2007), which takes the form

$$y_t(\tau) = \beta_{1t}^{SV} + \beta_{2t}^{SV}\left(\frac{1 - e^{-\lambda_1\tau}}{\lambda_1\tau}\right) + \beta_{3t}^{SV}\left(\frac{1 - e^{-\lambda_1\tau}}{\lambda_1\tau} - e^{-\lambda_1\tau}\right) + \beta_{4t}^{SV}\left(\frac{1 - e^{-\lambda_2\tau}}{\lambda_2\tau} - e^{-2\lambda_2\tau}\right) + \varepsilon_t(\tau). \tag{3.9}$$

(3.9) differs from (3.8) only in the last factor loading, where $-e^{-\lambda_2\tau}$ is replaced by $-e^{-2\lambda_2\tau}$.

In the empirical application, I used OLS [13] to estimate (3.9) and specified $\lambda_1 = 0.065$ and $\lambda_2 = 0.015$ [14]. Based on this specification, the third and fourth factor loadings of (3.9) will peak at 36 and 64 months, which are appropriate medium terms to maturity. These factor loadings for the adjusted Svensson model as functions of $\tau$ are plotted in Figure 3.3. The OLS estimates of $\beta_{i,t}^{SV}$, $i = 1, 2, 3, 4$, given the specifications of $\lambda_1$ and $\lambda_2$, are plotted in Figure 3.4.
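The OLS step for the adjusted Svensson model (3.9) can be sketched in the same way as for the Nelson-Siegel model. The function names and the synthetic factors below are hypothetical; $\lambda_1$ and $\lambda_2$ are set to the values quoted above, and the maturity grid is the one listed in Section 3.4.1.

```python
import numpy as np

def adjusted_svensson_loadings(tau, lam1, lam2):
    """Four factor loadings of the adjusted Svensson model (3.9)."""
    tau = np.asarray(tau, dtype=float)
    x1, x2 = lam1 * tau, lam2 * tau
    slope = -np.expm1(-x1) / x1                       # (1 - exp(-x1)) / x1
    curv1 = slope - np.exp(-x1)
    curv2 = -np.expm1(-x2) / x2 - np.exp(-2.0 * x2)   # adjusted fourth loading
    return np.column_stack([np.ones_like(tau), slope, curv1, curv2])

def fit_ols(yields, X):
    """Date-by-date OLS of yields (T x n) on the loading matrix X (n x k)."""
    betas, *_ = np.linalg.lstsq(X, np.asarray(yields, dtype=float).T, rcond=None)
    return betas.T

tau = np.array([24, 30, 36, 48, 60, 72, 84, 96, 108, 120], dtype=float)
X = adjusted_svensson_loadings(tau, lam1=0.065, lam2=0.015)

# Hypothetical factors for one date, used to generate noiseless yields.
true_betas = np.array([[0.070, -0.020, 0.010, 0.005]])
est_betas = fit_ols(true_betas @ X.T, X)

# A large condition number of X would signal the multicollinearity problem.
cond_number = np.linalg.cond(X)
print(est_betas, cond_number)
```

Checking the condition number of the loading matrix for candidate pairs $(\lambda_1, \lambda_2)$ is one simple way to see whether the third and fourth loadings are close to collinear.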

3.3 The Legendre Polynomial Model

Almeida et al. (1998) proposed a term structure model using Legendre polynomials. Let $k_i(x) : [-1, 1] \to [-1, 1]$ denote the $i$th Legendre polynomial. The first five Legendre polynomials take the following form:

$$k_1(x) = 1; \quad k_2(x) = x; \quad k_3(x) = \tfrac{1}{2}(3x^2 - 1); \quad k_4(x) = \tfrac{1}{2}(5x^3 - 3x); \quad k_5(x) = \tfrac{1}{8}(35x^4 - 30x^2 + 3).$$

The Legendre polynomials have the special property that they are orthogonal to each other with respect to the $L^2$ inner product on $[-1, 1]$, i.e.

$$\int_{-1}^{1} k_i(x) k_j(x)\,dx = \begin{cases} \frac{2}{2n + 1}, & \text{if } i = j, \\ 0, & \text{if } i \neq j, \end{cases}$$

where $n = i - 1$ is the degree of $k_i$.

Figure 3.3.: Factor loadings of the adjusted Svensson Model

[12] Many papers have discussed this issue, see for example Bolder and Streliski (1999) and Gimeno and Nave (2006).

[13] In fact, I tried other estimation procedures. For example, I allowed $\lambda_1$ and $\lambda_2$ to vary over time, and used non-linear least squares to estimate (3.8) and (3.9). However, multicollinearity problems frequently arise when no bounds have been specified. Even if I specify bounds for $\beta_{i,t}^{SV}$ as well as for $\lambda_{1,t}$ and $\lambda_{2,t}$, the results are very sensitive to the bounds and often unstable. OLS estimation, however, turned out to be the most stable and robust approach.

[14] These two parameter values were chosen so that the estimated $\beta_{3,t}$ and $\beta_{4,t}$ do not take on large

$\{1, x, x^2, \dots\}$. To illustrate the shape of the polynomials, the first five polynomials are plotted in Figure 3.5.

Figure 3.5.: Legendre polynomials

Following Almeida et al. (1998), in order to model the term structure, we take

$$x = \frac{2\tau}{l} - 1,$$

where $l$ is the longest maturity in a given set of yields. As $k_i$ is now a function of $\tau$, we can write $k_i(\tau)$ instead of $k_i(x)$. Consequently, we can model the term structure by

$$y_t(\tau) = k_1(\tau)\beta_{1t}^{LG} + k_2(\tau)\beta_{2t}^{LG} + \dots + k_5(\tau)\beta_{5t}^{LG} + \varepsilon_t(\tau). \tag{3.10}$$

As $k_i(\tau)$, $i = 1, \dots, 5$, are fixed, $\beta_{it}^{LG}$ can be estimated by OLS.
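The regressors of (3.10) can be built with NumPy's Legendre utilities. The sketch below uses hypothetical function names and synthetic factors; it maps maturities to $[-1, 1]$, evaluates $k_1, \dots, k_5$, checks the orthogonality property by Gauss-Legendre quadrature, and recovers the factors by OLS.

```python
import numpy as np
from numpy.polynomial import legendre as leg

def legendre_design(tau, longest):
    """Columns k_1(tau), ..., k_5(tau) after mapping tau to x = 2 * tau / l - 1."""
    x = 2.0 * np.asarray(tau, dtype=float) / longest - 1.0
    return leg.legvander(x, 4)   # Legendre polynomials of degree 0 to 4

# Orthogonality check: the Gram matrix of k_1..k_5 on [-1, 1].
nodes, weights = leg.leggauss(8)          # exact for polynomials up to degree 15
V = leg.legvander(nodes, 4)
gram = V.T @ (weights[:, None] * V)

# OLS fit of (3.10) on hypothetical noiseless yields for one date.
tau = np.array([24, 30, 36, 48, 60, 72, 84, 96, 108, 120], dtype=float)
K = legendre_design(tau, longest=tau.max())
true_betas = np.array([0.065, 0.010, -0.004, 0.002, -0.001])
est_betas, *_ = np.linalg.lstsq(K, K @ true_betas, rcond=None)
print(np.round(gram, 6), est_betas)
```

The polynomials are orthogonal as functions on $[-1, 1]$, but the columns of the design matrix evaluated on an uneven maturity grid are generally not exactly orthogonal, so the regression is still needed to separate the factors.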

3.4 Fitting performances


3.4.1. The Data

For all the empirical applications in this thesis, I use U.S. government zero-coupon yields. These zero-coupon yields are constructed from end-of-month price quotes (bid-ask average) for U.S. Treasuries, from January 1970 through December 2000, taken from the CRSP government bonds files. CRSP filters the data, eliminating bonds with option features (callable and flower bonds) and bonds with special liquidity problems (notes and bonds with less than one year to maturity, and bills with less than one month to maturity), and then converts the filtered bond prices to unsmoothed Fama and Bliss (1987) forward rates. Subsequently, these forward rates are converted into yields of zero-coupon bonds. The maturities that I use are 24, 30, 36, 48, 60, 72, 84, 96, 108, and 120 months.

3.4.2. In-Sample Fitting Results

Table 3.1 summarizes the fitting performances of the AFNS model, the Svensson model and the Legendre polynomial model.

This table shows that, compared to the other two models, the AFNS model performs the worst. Besides the fact that it has fewer factors, the other reason is that its factors are estimated by maximum likelihood instead of least squares. Consequently, the estimator is more efficient than the least squares estimator, but will not necessarily yield the best fit. In addition, the mean residuals of the AFNS model are positive. This results from the presence of the additional term $-\frac{C(t,T)}{\tau}$ in the model. Among the three models, the Legendre polynomial model performs best in terms of mean residuals and RMSE. This is not surprising considering that this model has the largest number of factors.

Table 3.1.: Fitting performances of three models

              AFNS              Svensson          Legendre
τ (month)     MR      RMSE      MR      RMSE      MR      RMSE
24            0.81    21.15     −0.04   1.87      0.16    3.20
30            1.31    21.11     0.32    5.66      −0.16   5.43
36            1.31    20.79     0.16    4.26      −0.29   4.71
48            1.72    20.41     0.45    6.97      0.84    5.94
60            −1.50   19.26     −2.63   7.05      −1.60   6.00
72            2.22    18.27     1.38    7.36      2.16    7.80
84            −0.37   17.32     −0.96   9.33      −1.28   8.76
96            1.71    16.75     1.25    8.76      −0.30   6.69
108           2.75    17.00     2.17    10.12     0.68    6.39
120           −1.13   16.68     −2.12   8.02      −0.21   2.03
Mean          0.88    18.88     0.00    6.94      0.00    5.70
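The two statistics reported in Table 3.1 can be computed per maturity as sketched below. The function name is hypothetical, and the sketch assumes that a residual is defined as the actual yield minus the fitted yield; the thesis does not state the sign convention in this section.

```python
import numpy as np

def fit_summary(actual, fitted):
    """Mean residual (MR) and root mean squared error (RMSE) for each maturity.

    actual, fitted: arrays of shape (T, n), one column per maturity.
    Residuals are taken as actual minus fitted (assumption).
    """
    resid = np.asarray(actual, dtype=float) - np.asarray(fitted, dtype=float)
    mr = resid.mean(axis=0)
    rmse = np.sqrt((resid ** 2).mean(axis=0))
    return mr, rmse

# Tiny hypothetical example with two dates and two maturities.
actual = np.array([[1.0, 2.0], [3.0, 4.0]])
fitted = np.array([[0.0, 2.0], [2.0, 6.0]])
mr, rmse = fit_summary(actual, fitted)
print(mr, rmse)
```

A model can have a mean residual close to zero and still a large RMSE, which is why the table reports both.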


Chapter 4.

Predicting Bond Excess Returns

Cochrane and Piazessi (2005) use one- to five-year forward rates to forecast one-year bond excess returns in-sample. I will first briefly review their approach in this chapter. Subsequently, I will generalize their approach and use the factors of various yield models to make in-sample predictions of one-year bond excess returns. At the end of this chapter, the in-sample forecasting performances of the yield model factors will be compared with that of the Cochrane and Piazessi approach. In addition, the overlapping data problem in the Cochrane and Piazessi approach will also be discussed.

4.1 The Cochrane and Piazessi model

As mentioned in Section 1.2, Fama and Bliss (1987) used forward rates to forecast bond excess returns and found evidence that they have predictive power. Their model takes the form

$$R_{t+12}(\tau) = a_2(\tau) + b_2(\tau)\left[f_t(T, T + s) - y_t(1)\right] + u_{t+12}(\tau), \qquad t = 1, 2, 3, \dots \tag{4.1}$$

where $a_2(\tau)$ and $b_2(\tau)$ are parameters and $u_{t+12}(\tau)$ are error terms, all depending on the bond maturity $\tau$.

Cochrane and Piazessi (2005) extended (4.1) and used one to five- year forward rates to forecast the one-year excess returns. Their model has the following form:


is a 6× 1 right-hand-side vector and

$$b(\tau) = \big(\alpha_1(\tau),\; b_1(\tau),\; b_2(\tau),\; b_3(\tau),\; b_4(\tau),\; b_5(\tau)\big)'$$

is a 6 × 1 vector of parameters depending on the term to maturity, $\tau$. It is notable that the regressors $F_t$ do not change with the maturity of the dependent variable.

Furthermore, Cochrane and Piazzesi (2005) showed that the $b(\tau)$, $\tau = 2, 3, 4, 5$, have a similar “tent shape”. Using the same dataset as in Cochrane and Piazzesi (2005), the Ordinary Least Squares estimates of $b(\tau)$, $\tau = 2, 3, 4, 5$, are plotted in Figure 4.1. Based on this “tent-shaped structure”, Cochrane and Piazzesi (2005) proposed a restricted version of the model,

$$R_{t+12}(\tau) = (F_t' b)\,\gamma(\tau) + u_{t+12}(\tau), \qquad \sum_{\tau=2}^{5} \gamma(\tau) = 1, \tag{4.3}$$

where $\gamma(\tau)$ is a scalar.

Figure 4.1.: The OLS estimates of $b(\tau)$

Notes: The factors are estimated from the CRSP zero-coupon bond yields from July 1953 to December 2000.

To estimate the restricted model, they first regress the average (across maturity) excess return on all forward rates,

$$\frac{1}{4}\sum_{\tau=2}^{5} R_{t+12}(\tau) = F_t' b + \bar u_{t+12}, \qquad \bar u_{t+12} = \sum_{\tau=2}^{5} u_{t+12}(\tau),$$

to obtain the OLS estimate of $b$, which we denote as $\hat b$. Subsequently, they regress $R_{t+12}(\tau)$ on $F_t'\hat b$ to obtain the estimates for $\gamma(\tau)$. In this way, by imposing the restriction $\gamma(\tau) b = b(\tau)$, they can use one factor, a linear combination of forward rates, to predict the excess returns. They claim that this hidden “tent-shaped structure” provides predictive power that is often neglected by other models and is unrelated to the level, slope and curvature of the yield curve.

4.2 Bond Excess Return Prediction

In the previous chapter, I reviewed three term structure models and showed how to use them to fit yield curves. Having obtained their factors, I now use them to forecast the one-year bond excess returns and compare the results with those of the Cochrane and Piazzesi model. Issues related to estimation, such as the overlapping data problem, are also discussed.

We generalize (4.2) and consider the bond excess return prediction regression model

$$R_{t+12}(\tau) = X_t' b(\tau) + u_{t+12}(\tau), \qquad t = 1, 2, 3, \ldots, n, \tag{4.4}$$

where $X_t$ consists of factors estimated from term structure models, or simply forward rates, in addition to a constant. For example, if we want to use the Svensson factors to predict the one-year excess returns,

$$X_t = (1, \hat\beta^{SV}_{1t}, \hat\beta^{SV}_{2t}, \hat\beta^{SV}_{3t}, \hat\beta^{SV}_{4t})',$$

where $\hat\beta^{SV}_{it}$, $i = 1, 2, 3, 4$, are OLS estimates obtained from (3.9). $b(\tau)$ is the vector of parameters to be estimated. The dimensions of the vectors $X_t$ and $b(\tau)$ are defined conformably.
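The regression (4.4) and its $R^2$ can be illustrated with a minimal sketch. The data below are simulated stand-ins, not the CRSP yields used in this thesis, and the function name `predictive_r2` as well as the coefficient values are hypothetical choices made only for illustration.

```python
import numpy as np

def predictive_r2(R_next, X):
    """OLS of R_{t+12} on X_t (X includes a constant column); returns (b_hat, R^2)."""
    b_hat, *_ = np.linalg.lstsq(X, R_next, rcond=None)
    resid = R_next - X @ b_hat
    tss = np.sum((R_next - R_next.mean()) ** 2)
    return b_hat, 1.0 - resid @ resid / tss

# Simulated stand-in data (assumption): 360 months, four factors plus a constant.
rng = np.random.default_rng(0)
n, k = 360, 4
factors = rng.normal(size=(n, k))
X = np.column_stack([np.ones(n), factors])
true_b = np.array([0.5, 1.0, -0.5, 0.25, 0.0])
R_next = X @ true_b + rng.normal(scale=2.0, size=n)

b_hat, r2 = predictive_r2(R_next, X)
print(b_hat.round(2), round(r2, 3))
```

In an application, `X` would hold the estimated yield-curve factors (or forward rates) at time $t$ and `R_next` the excess returns realized twelve months later, one regression per maturity $\tau$.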


application. Following Cochrane and Piazzesi's approach, I regress monthly data of $R_{t+12}(\tau)$ on $X_t$ to obtain OLS estimates for $b(\tau)$, separately for each $\tau$. The resulting $R^2$s for different models and for different maturities are presented in Table 4.1.¹

Table 4.1.: Summary of $R^2$ of one-year bond excess return regressions based on different models

τ (Year)   Arbitrage-free    Svensson   Legendre       Cochrane-
           Nelson-Siegel                Polynomials    Piazzesi
2          0.31              0.30       0.29           0.35
3          0.30              0.31       0.30           0.37
4          0.30              0.31       0.31           0.39
5          0.30              0.31       0.31           0.36

Notes: The models are applied to the monthly CRSP zero-coupon yields from Jan. 1970 to Dec. 2000. The first column shows the maturity, in years, of the long bond on which the excess return is based.

Table 4.1 shows that the AFNS, Svensson, Legendre and Cochrane and Piazzesi models perform quite similarly in forecasting the one-year bond excess returns, in terms of $R^2$s. The Cochrane and Piazzesi model performs slightly better. Furthermore, the table also shows that these models perform equally well across maturities, with $R^2$s around 0.3 to 0.4.

Notice that, compared with the AFNS (3 factors) and the Svensson (4 factors) models, the advantage of the Cochrane and Piazzesi model (5 factors) is partly due to its extra factors. Moreover, it is impressive that with only three factors, two fewer than the Cochrane and Piazzesi model, the AFNS model achieves an $R^2$ as high as 0.31. On the other hand, the performance of the Legendre polynomials is disappointing. With five estimated factors, its forecasting $R^2$s are sometimes even lower than those of the AFNS model.

However, the estimation approach we employed to estimate $b(\tau)$ has a potential drawback. Though the exogeneity of $X_t$ would ensure the consistency of the OLS estimator, the conventional estimator of the standard errors is not consistent. There are two reasons for this. First, as with most financial time series, the bond returns are not independent, so the error terms are not independent either. Second, the one-year bond excess returns are computed over a horizon of one year, while the right-hand-side variables are observed every month. This discrepancy causes the error terms in (4.4) to follow a moving average process, so they are correlated with each other.

¹ Detailed results regarding these regressions, including parameter estimates, standard errors as well
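The moving average structure caused by overlapping horizons can be made concrete with a small simulation. The sketch below assumes, purely for illustration, that one-month shocks are i.i.d.; overlapping 12-month sums of such shocks then have autocorrelation $1 - j/12$ at lag $j < 12$ and zero beyond. The variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
monthly = rng.normal(size=200_000)            # i.i.d. one-month shocks (simulated)
# Overlapping 12-month sums: annual[t] = monthly[t] + ... + monthly[t + 11]
annual = np.convolve(monthly, np.ones(12), mode="valid")

def autocorr(x, lag):
    """Sample autocorrelation of x at the given positive lag."""
    x = x - x.mean()
    return float(x[lag:] @ x[:-lag] / (x @ x))

acf = [autocorr(annual, j) for j in range(1, 14)]
theory = [max(0.0, 1 - j / 12) for j in range(1, 14)]
for j, (a, t) in enumerate(zip(acf, theory), start=1):
    print(j, round(a, 3), round(t, 3))
```

The sample autocorrelations decline roughly linearly from 11/12 to zero, which is the MA(11) pattern that makes the conventional standard errors unreliable.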

In order to obtain consistent estimates for the standard errors, we use the Newey-West estimator (Newey and West (1987)) to estimate the variance-covariance matrix of the OLS estimates. This estimator is given by

$$\hat\Sigma_{NW} = \hat\Gamma(0) + \sum_{j=1}^{p}\Big(1 - \frac{j}{p+1}\Big)\Big(\hat\Gamma(j) + \hat\Gamma(j)'\Big), \tag{4.5}$$

where

$$\hat\Gamma(j) = \frac{1}{n}\sum_{t=j+1}^{n} \hat u_t\,\hat u_{t-j}\,X_t X_{t-j}',$$

with $\hat u_t$ the OLS residuals.
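A minimal sketch of (4.5) follows. The function name `newey_west_cov` is hypothetical, the data are simulated, and the final "sandwich" step that turns $\hat\Sigma_{NW}$ into a covariance matrix for the OLS coefficients is standard practice that is assumed here rather than taken from the text.

```python
import numpy as np

def newey_west_cov(X, resid, p):
    """Covariance of the OLS coefficients using Bartlett weights 1 - j/(p+1)."""
    n = X.shape[0]
    Xu = X * resid[:, None]                  # row t holds u_t * X_t'
    S = Xu.T @ Xu / n                        # Gamma_hat(0)
    for j in range(1, p + 1):
        G = Xu[j:].T @ Xu[:-j] / n           # Gamma_hat(j)
        S += (1 - j / (p + 1)) * (G + G.T)
    XtX_inv = np.linalg.inv(X.T @ X / n)
    return XtX_inv @ S @ XtX_inv / n         # sandwich form (assumed convention)

# Simulated regression with MA(11) errors, mimicking overlapping annual returns.
rng = np.random.default_rng(2)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
u = np.convolve(rng.normal(size=n + 11), np.ones(12), mode="valid")
y = X @ np.array([1.0, 0.5]) + u
b_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ b_hat
cov_nw = newey_west_cov(X, resid, p=12)
print(np.sqrt(np.diag(cov_nw)))
```

With `p=0` the function reduces to the heteroskedasticity-robust (White) covariance, which gives a simple way to check an implementation.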

In fact, using the Newey-West estimator, we could compute the Feasible Generalized Method of Moments (FGMM) estimator, a more efficient estimator than OLS, given by

$$\hat b_{GLS}(\tau) = \big(X'(X\hat\Sigma_{NW}^{-1}X')X\big)^{-1} X'(X\hat\Sigma_{NW}^{-1}X')R(\tau), \tag{4.6}$$

where $X' = (X_1, \ldots, X_n)$ and $R(\tau)' = (R_1(\tau), \ldots, R_n(\tau))$.

The overlapping data problem does not occur if we use monthly data to predict the one-month bond excess return instead of the one-year bond excess return. If the one-month holding period return on a $\tau$-month bond is defined by

$$r_{t+1}(\tau) = \ln(P_{t+1}(\tau - 1)) - \ln(P_t(\tau)),$$

we could model the one-month bond excess return on a $\tau$-month bond as

$$R_{t+1}(\tau) = X_t' b(\tau) + \epsilon_{t+1}(\tau), \qquad \epsilon_{t+1}(\tau) \sim \text{i.i.d.}(0, \sigma_\epsilon^2), \qquad t = 1, 2, 3, \ldots, n, \tag{4.7}$$

where

$$R_{t+1}(\tau) = \ln(P_{t+1}(\tau - 1)) - \ln(P_t(\tau)) - y_t(1),$$

and $y_t(1)$ denotes the 1-month yield at time $t$. In this case, OLS estimates will yield


Chapter 5.

Seasonality in the Predictability of Excess Returns

In the previous chapter, I showed that various models perform well in making in-sample predictions for the one-year bond excess returns, in terms of $R^2$s. These $R^2$s are based on regressions of the overlapping one-year bond excess returns on the level, slope and curvatures of the yield curves (the AFNS model, the Svensson model and the Legendre polynomials) or on the forward rates (the Cochrane and Piazzesi model). However, as discussed, the potential drawbacks of this regression (the overlapping data problem and the serial correlation of the bond excess returns) affect the accuracy of the estimation, as well as the $R^2$s.¹

The overlapping data problem would not arise if we used non-overlapping data to run the regression. However, as we have monthly data for 30 years, using only one series of non-overlapping data is, in some sense, a waste of information. Moreover, it is difficult to choose which month to use as the starting month. In this thesis, I propose a “Seemingly Unrelated Regression” (SUR) model² that reorganizes the overlapping data. Based on this model, I show in this chapter that the predictability of the bond excess returns varies over the course of each year. Furthermore, the level, slope and curvature of the yield curves appear to have very high predictive power in January and in the last quarter of the year.

¹ Harri and Brorsen (1998) discuss the overlapping data problem extensively. In particular, Thornton and Valente (2009) showed that the use of overlapping data is the main reason for the high $R^2$ obtained in Cochrane and Piazzesi (2005).

Based on these findings, I propose a SUR model with a restricted covariance matrix that captures these effects. Furthermore, using this structure, the statistical significance of these effects can be tested with a likelihood ratio test.

5.1 Predicting one-year Bond Excess Returns: A SUR Framework

5.1.1. The SUR Model

In order to construct the SUR model, I first need to reorganize the data series for each variable. From each original series we obtain 12 new series, each starting in a different month of the year. Each of these 12 series consists of data collected in the same month of different years. For example, for the one-year excess return, the first of the 12 series consists of January data only and is given by

$$\{R_{1+12}(\tau), R_{1+24}(\tau), R_{1+36}(\tau), \ldots, R_{1+12 \times n}(\tau)\}.$$

Similarly, the fourth series starts in April and is given by

$$\{R_{4+12}(\tau), R_{4+24}(\tau), R_{4+36}(\tau), \ldots, R_{4+12 \times n}(\tau)\}.$$

The same construction is performed for each of the explanatory variables, including the AFNS factors, the forward rates, etc. For each variable, our SUR model utilizes all 12 non-overlapping series. Therefore, for predicting one-year bond excess returns, the SUR model consists of the following equations:

$$R_i = x_i \beta_i + u_i, \qquad i = 1, \ldots, 12, \tag{5.1}$$

where $R_i$: $n \times 1$ denotes the one-year excess return on a two-year bond.³ $x_i$: $n \times k$ contains the non-overlapping predicting factors, which are either the estimated level, slope and curvatures of the yield curves (AFNS, Svensson or Legendre polynomials) or the forward rates (Cochrane and Piazzesi model). The number of columns, $k$, depends on the model we use to predict.

³ Notice that since we will focus on the time-varying property in the predictability of the one-year

For example, when we use the Svensson model, i.e. $x_i = (1, \hat\beta^{SV}_{1i}, \ldots, \hat\beta^{SV}_{4i})$, we have $k = 5$. In addition, we name each SUR model after the factors it uses. For example, the SUR model that uses the Svensson factors to predict the bond excess return is called “Svensson-SUR”.
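The reorganization of a monthly series into 12 calendar-month series can be sketched as follows. The helper `split_by_calendar_month` and its `first_month` argument are hypothetical names introduced for illustration, and the integer series stands in for actual monthly observations.

```python
import numpy as np

def split_by_calendar_month(series, first_month=1):
    """Return a (years, 12) array whose column i holds all observations of month i + 1.

    first_month is the calendar month (1 = January) of the first observation.
    Observations before the first January and an incomplete last year are dropped.
    """
    series = np.asarray(series, dtype=float)
    offset = (1 - first_month) % 12          # observations to skip until a January
    usable = series[offset:]
    n_years = len(usable) // 12
    return usable[: n_years * 12].reshape(n_years, 12)

monthly_index = np.arange(1, 37)             # 36 hypothetical monthly observations
by_month = split_by_calendar_month(monthly_index)
print(by_month[:, 0])                        # January series
print(by_month[:, 3])                        # April series
```

Each column is non-overlapping in the sense of (5.1): consecutive entries are twelve months apart, so a one-year return in one row does not share months with the next row.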

We assume the variables in $x_i$, $i = 1, \ldots, 12$, to be exogenous. We also assume that within each vector $u_i$ the errors are independent and identically distributed. The independence assumption implies that the one-year bond excess return of this year is uncorrelated with the bond excess return of next year. As bond excess returns are usually serially correlated, the validity of this assumption needs to be examined. Therefore, I carried out a simple Lagrange Multiplier test and found that, for 2- to 4-year bond excess returns during our sample period, this hypothesis is not rejected.

There is certainly correlation between the excess returns in January and February within a year. Therefore, we assume contemporaneous correlation between them. More precisely, let $U := (u_1, \ldots, u_{12})$ and let $U_t$ be the $t$th row of $U$. We assume $U_t$ to have a covariance matrix of the form

$$\Sigma = E(U_t' U_t) = \begin{pmatrix} \sigma_{1,1} & \cdots & \sigma_{1,12} \\ \vdots & \ddots & \vdots \\ \sigma_{12,1} & \cdots & \sigma_{12,12} \end{pmatrix}. \tag{5.2}$$

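Stacking $R_i$, $\beta_i$ and $u_i$ vertically, i.e.

$$R_\bullet := \begin{pmatrix} R_1 \\ \vdots \\ R_{12} \end{pmatrix}, \qquad \beta_\bullet := \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_{12} \end{pmatrix} \qquad \text{and} \qquad u_\bullet := \begin{pmatrix} u_1 \\ \vdots \\ u_{12} \end{pmatrix},$$

and letting $x_\bullet$ be the block-diagonal matrix of order $12n \times 12k$

$$x_\bullet = \begin{pmatrix} x_1 & & & \\ & x_2 & & \\ & & \ddots & \\ & & & x_{12} \end{pmatrix},$$

we obtain the single-equation representation

$$R_\bullet = x_\bullet \beta_\bullet + u_\bullet, \tag{5.3}$$

where

$$\Sigma_\bullet := E(u_\bullet u_\bullet') = \begin{pmatrix} \sigma_1^2 I_n & \cdots & \sigma_{1,12} I_n \\ \vdots & \ddots & \vdots \\ \sigma_{12,1} I_n & \cdots & \sigma_{12}^2 I_n \end{pmatrix} = \Sigma \otimes I_n,\,{}^{4} \tag{5.4}$$

based on the assumptions made on $U$.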

The matrix representation (5.3) is the SUR model on which most of our analysis is based. Notice that if $\Sigma$ is diagonal, i.e. $\sigma_{i,j} = 0$ for all $i \neq j$, the 12 regressions in (5.1) are actually unrelated. In this case, it is equivalent to consider the 12 regressions in (5.1) separately. However, when $\Sigma$ is not diagonal and the $x_i$, $i = 1, \ldots, 12$, are not all the same, there is a gain in efficiency from estimating (5.3) jointly.

This model has the virtue that we use all the information in our data rather than a single series of non-overlapping data. Furthermore, by specifying the correlation structure, we can use the covariance matrix to apply a more efficient estimation procedure than OLS. We do so in the next section.

5.1.2. Feasible Generalized Least Squares Estimation

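In order to estimate (5.3) by Feasible Generalized Least Squares (FGLS), we first need to estimate the covariance matrix $\Sigma$. Let $\hat\Sigma$ denote a consistent estimator of $\Sigma$. According to Davidson and MacKinnon (2004), the $(i,j)$th element of $\hat\Sigma$ can be obtained by

$$\hat\sigma_{i,j} = \frac{1}{n}\hat u_i' \hat u_j,$$

where $\hat u_i$ is the vector of OLS residuals from equation $i$ of (5.1). The consistent estimator of $\Sigma_\bullet$ is therefore given by

$$\hat\Sigma_\bullet = \hat\Sigma \otimes I_n.$$

Based on this estimate, the FGLS estimator of $\beta_\bullet$ is given by⁵

$$\hat\beta_\bullet = (x_\bullet' \hat\Sigma_\bullet^{-1} x_\bullet)^{-1} x_\bullet' \hat\Sigma_\bullet^{-1} R_\bullet. \tag{5.5}$$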
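The two-step FGLS estimator (5.5) can be sketched as below. The function name `sur_fgls` and the small simulated system (three equations instead of twelve) are illustrative assumptions. The example deliberately uses identical regressors in every equation, a case in which FGLS is known to coincide with equation-by-equation OLS, so the output can be checked.

```python
import numpy as np

def sur_fgls(R_list, x_list):
    """Two-step FGLS for R_i = x_i beta_i + u_i, i = 1..m, each with n rows."""
    m, n = len(R_list), len(R_list[0])
    # Step 1: equation-by-equation OLS residuals, collected as an n x m matrix.
    resid = np.column_stack([
        R - x @ np.linalg.lstsq(x, R, rcond=None)[0]
        for R, x in zip(R_list, x_list)
    ])
    sigma_hat = resid.T @ resid / n            # sigma_hat[i, j] = u_i' u_j / n
    # Step 2: GLS on the stacked system with covariance sigma_hat kron I_n.
    k_total = sum(x.shape[1] for x in x_list)
    x_big = np.zeros((m * n, k_total))
    col = 0
    for i, x in enumerate(x_list):
        x_big[i * n:(i + 1) * n, col:col + x.shape[1]] = x
        col += x.shape[1]
    R_big = np.concatenate(R_list)
    omega_inv = np.kron(np.linalg.inv(sigma_hat), np.eye(n))
    xtw = x_big.T @ omega_inv
    return np.linalg.solve(xtw @ x_big, xtw @ R_big), sigma_hat

rng = np.random.default_rng(3)
n, m = 40, 3
x_common = np.column_stack([np.ones(n), rng.normal(size=n)])
cov = [[1.0, 0.8, 0.5], [0.8, 1.0, 0.8], [0.5, 0.8, 1.0]]
errors = rng.multivariate_normal(np.zeros(m), cov, size=n)
R_list = [x_common @ np.array([1.0, 0.5 * (i + 1)]) + errors[:, i] for i in range(m)]

beta_fgls, sigma_hat = sur_fgls(R_list, [x_common] * m)
beta_ols = np.concatenate([np.linalg.lstsq(x_common, R, rcond=None)[0] for R in R_list])
print(np.abs(beta_fgls - beta_ols).max())
```

When the regressors differ across equations, as they do for the 12 calendar-month series, the FGLS and OLS estimates no longer coincide.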

⁴ “⊗” denotes the Kronecker product.

⁵ In fact, theoretically we could iterate the estimation procedure above to obtain a more efficient

The FGLS estimates $\hat\beta_\bullet$ for the different models are plotted in Figure 5.1. The figure shows the FGLS estimates of $(\hat\beta_{i,2}, \ldots, \hat\beta_{i,k})$, $i = 1, \ldots, 12$, for each model.⁶ From Figure 5.1, we can see that the $\hat\beta_i$s uniformly display similar patterns.

5.2 The Time-varying Predictability in Bond Excess Returns

As Figure 5.1 shows, the FGLS estimates of $\beta_\bullet$ seem to form similar structures. But how do these similarly shaped structures perform in different months of the year?

Write $\hat\beta_\bullet$ as

$$\hat\beta_\bullet = (\hat\beta_1, \ldots, \hat\beta_{12})',$$

so that we can define a separate $R^2$ statistic for each equation based on $\hat\beta_i$, $i = 1, \ldots, 12$. The separate $R^2_i$s are defined as the sum of squares (SSQ) of $x_i\hat\beta_i$ minus the mean of $R_i$, $\bar R_i$ say, divided by the sum of squares of $R_i - x_i\hat\beta_i$:

$$R^2_i = \frac{SSQ(x_i\hat\beta_i - \bar R_i)}{SSQ(R_i - x_i\hat\beta_i)}. \tag{5.6}$$

In this way, we can examine the predictive power of the predicting factors in month $i$, using the FGLS estimates $\hat\beta_i$.⁷ These separate $R^2_i$s, along with the $R^2$ for the overall SUR system, are presented in Table 5.1. Furthermore, in order to illustrate the variation of the $R^2$s throughout the year, the $R^2_i$, $i = 1, \ldots, 12$, are plotted in Figure 5.2.

We can clearly see from the four panels of Figure 5.2 that the $R^2_i$, $i = 1, \ldots, 12$, vary greatly for all four models. Given the construction of $R^2_i$, this variation reveals the time-varying predictability of the bond excess returns.

A closer look at the patterns in Figure 5.2 reveals a difference between the SUR models that use the level, slope and curvatures of the yield curves to predict, and the SUR model that uses the one- to five-year forward rates as predictive factors.

⁶ The estimated parameter for the first factor (the constant term) of each $\beta_i$, $i = 1, \ldots, 12$, is not plotted in the figure, as its absolute value is much larger than that of the other parameters and would therefore make the pattern less clear.

⁷ In our case, as we take into account the covariances $\Sigma_\bullet$ in our FGLS estimation, the $R^2_i$s for the

Figures 5.2a, 5.2b and 5.2c all display a similar “smile-like” pattern: $R^2_1$, as well as $R^2_{11}$ and $R^2_{12}$, are relatively higher than the others. In particular, as we can observe from Table 5.1, the $R^2_1$ for the models using the Svensson factors and the Legendre polynomials have values of 0.51 and 0.53, respectively. They are also the largest values in the whole table. The AFNS factors, on the other hand, appear to have higher predictive power in November and December than in January. As mentioned in Section 1.2, there exists a “January effect” as well as a “Turn of the year effect” in the stock market and in the corporate bond market. The “smile” pattern in Figures 5.2a, 5.2b and 5.2c suggests that such effects also exist in the government bond market, namely in the predictability of the government bond excess returns. In the subsequent sections, I investigate the existence of such effects more thoroughly by testing this hypothesis statistically.

In contrast, the Cochrane-Piazzesi-SUR model displays a more complicated pattern. As we can see from Figure 5.2d, though $R^2_1$, $R^2_{11}$ and $R^2_{12}$ are higher than average, the one- to five-year forward rates seem to have the strongest predictive power in April and September.

In addition, Figure 5.2 also shows the separate $R^2$s based on the OLS estimates of $\beta_i$, instead of the FGLS estimates $\hat\beta_i$. It is apparent that the dotted lines, which are the $R^2$s based on the OLS estimates, lie above the solid lines. This pattern is caused by the covariance structure (5.4) used in the estimation process, which in fact imposes constraints on the estimates. From this fact we can also infer that the serial correlation of the bond excess returns results in an overestimation of the predictability of bond excess returns.⁸

From Table 5.1, we can also observe that, though the Cochrane and Piazzesi model still takes the lead in $R^2_{SUR}$, its advantage is smaller than in the $R^2$s based on overlapping data shown in Table 4.1. In particular, the four Svensson factors (level, slope and two curvature factors) together produce an $R^2_{SUR}$ of 0.37, which is the closest to the $R^2_{SUR}$ found for the five forward rates.

⁸ Thornton and Valente (2009) showed explicitly that the predictive power of the Cochrane and Piazess

5.3 Testing Seasonality Based on the SUR model

The previous section showed the presence of a seasonal effect in the predictability of bond excess returns via the graphs of the $R^2_i$, $i = 1, \ldots, 12$. In fact, the seasonal effect is more evident when we look at the variances of the prediction errors. The prediction error is defined as the realized bond excess return $R_i$ minus the predicted bond excess return, $\hat R_i = x_i\hat\beta_i$. In our case, the variances of the prediction errors are simply $\sigma_i^2$, $i = 1, \ldots, 12$, the diagonal elements of the variance-covariance matrix in (5.4). Their consistent estimates, $\hat\sigma_i^2$, are given by

$$\hat\sigma_i^2 = \frac{1}{n}\hat u_i'\hat u_i,$$

where $\hat u_i$ is the vector of OLS residuals from (5.1). A low value of $\hat\sigma_i^2$ reflects a relatively high prediction precision, and vice versa. Figure 5.3 shows the $\hat\sigma_i^2$, $i = 1, \ldots, 12$, for the different models.

The patterns of the plots in Figure 5.3 are in accordance with the patterns in Figure 5.2, and are even more persuasive in suggesting a seasonal effect in the excess return predictability. High values of $R^2_i$ in Figure 5.2 correspond to low values of $\hat\sigma_i^2$. Furthermore, panels (a), (b) and (c) of Figure 5.3 (corresponding to the AFNS, Svensson and Legendre polynomial models, respectively) display very low variances of the prediction errors for excess returns in January, November and December, and higher values of $\hat\sigma_i^2$ for $i = 2, \ldots, 10$. On the other hand, the Cochrane and Piazzesi model, whose estimated variances of prediction errors are plotted in panel (d), reaches its lowest value in April. As the seasonal effects are reflected more clearly in Figure 5.3, we test the presence of seasonality more formally, based on the variances of the prediction errors, in the following subsections.

5.3.1. Restricted Covariance Matrix

covariance matrix, which has only two unknown parameters while effectively capturing the form of the unrestricted covariance matrix.

Let $\Sigma^R$ denote the restricted version of the variance-covariance matrix $\Sigma$ defined in (5.2). Let $\Sigma^R$ be given by

$$\Sigma^R = \lambda_1\Gamma_1 + \lambda_2\Gamma_2, \qquad \lambda_1 > 0, \tag{5.7}$$

where $\lambda_1$⁹ and $\lambda_2$ are unknown scalar parameters and $\Gamma_1$ and $\Gamma_2$ are defined as follows.

$\Gamma_1$ is a predefined 12 × 12 matrix that forms the correlation matrix of the rows $U_t$ defined in the paragraph above (5.2).¹⁰ I assume that it is given by

$$\Gamma_1 = \begin{pmatrix} 1 & \frac{11}{12} & \cdots & \frac{1}{12} \\ \frac{11}{12} & 1 & \cdots & \frac{2}{12} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{1}{12} & \frac{2}{12} & \cdots & 1 \end{pmatrix}.$$

$\Gamma_2$ is an additional structure that we impose to capture the seasonal effect. As the patterns in Figure 5.3 suggest, the variances of the prediction errors are particularly low in January, November and December. Therefore, we propose $\Gamma_2$: 12 × 12 to be of the form

$$\Gamma_2 = \begin{pmatrix} 0 & 0_{1\times 9} & 0_{1\times 2} \\ 0_{9\times 1} & \tilde\Gamma & 0_{9\times 2} \\ 0_{2\times 1} & 0_{2\times 9} & 0_{2\times 2} \end{pmatrix} \qquad \text{with} \qquad \tilde\Gamma = \begin{pmatrix} 1 & \frac{8}{9} & \cdots & \frac{1}{9} \\ \frac{8}{9} & 1 & \cdots & \frac{2}{9} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{1}{9} & \frac{2}{9} & \cdots & 1 \end{pmatrix}. \tag{5.8}$$

$\tilde\Gamma$: 9 × 9 effectively adds extra positive values to the variances and covariances of the months from February to October, given that $\lambda_2 > 0$.¹¹

By stacking the columns of $\Gamma_1$, $\Gamma_2$ and $\Sigma^R$, we can estimate the parameters $\lambda_1$ and $\lambda_2$ by OLS, based on the following regression:

$$\mathrm{vec}(\Sigma^R) = \lambda_1\,\mathrm{vec}(\Gamma_1) + \lambda_2\,\mathrm{vec}(\Gamma_2), \tag{5.9}$$

where “vec(.)” is the operator that stacks the columns of a matrix vertically. So $\mathrm{vec}(\Sigma)$, $\mathrm{vec}(\Gamma_1)$ and $\mathrm{vec}(\Gamma_2)$ are all 144 × 1 vectors. Let $\hat\lambda_1$ and $\hat\lambda_2$ denote the OLS estimators of $\lambda_1$ and $\lambda_2$, respectively. Based on the estimates of $\lambda_1$¹² and $\lambda_2$, we obtain an estimate of $\Sigma^R$,

$$\hat\Sigma^R = \hat\lambda_1\Gamma_1 + \hat\lambda_2\Gamma_2, \tag{5.10}$$

which is used as the initial estimate of $\Sigma^R$ in the maximum-likelihood estimation. Let $\Sigma^R_\bullet$ denote the restricted version of $\Sigma_\bullet$, the variance-covariance matrix of the SUR model. Its estimate, $\hat\Sigma^R_\bullet$ say, is therefore given by $\hat\Sigma^R \otimes I_n$.

⁹ $\lambda_1$ is defined to be positive, as it simply represents the smallest value of the variance among $\sigma_i^2$, $i = 1, \ldots, 12$. In the empirical application, the positivity of $\lambda_1$ is ensured by estimating $\lambda_1^2$ instead.

¹⁰ The form of the matrix is fixed based on the assumptions that, in (5.3), each independent element in $u_i$ follows an MA(11) process and the $x_i$, $i = 1, \ldots, 12$, are exogenous. A detailed derivation of this matrix is provided in the Appendix.

¹¹ Notice that the value of $\lambda_2$ can be negative, which is a necessary condition for the validate our
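The matrices $\Gamma_1$ and $\Gamma_2$ and the initial OLS step (5.9) can be sketched as follows. The helper names `bartlett_toeplitz` and `initial_lambdas` are hypothetical, and the matrix `sigma_demo` is constructed to have exactly the restricted form so that the regression recovers the chosen coefficients; in an application, the estimated $\hat\Sigma$ from the FGLS step would take its place.

```python
import numpy as np

def bartlett_toeplitz(size):
    """Matrix with entries 1 - |i - j| / size (1, 11/12, ..., 1/12 for size 12)."""
    idx = np.arange(size)
    return 1.0 - np.abs(idx[:, None] - idx[None, :]) / size

gamma1 = bartlett_toeplitz(12)
gamma2 = np.zeros((12, 12))
gamma2[1:10, 1:10] = bartlett_toeplitz(9)     # block for February to October

def initial_lambdas(sigma_hat):
    """OLS of vec(sigma_hat) on vec(Gamma_1) and vec(Gamma_2), as in (5.9)."""
    design = np.column_stack([gamma1.ravel(order="F"), gamma2.ravel(order="F")])
    lam, *_ = np.linalg.lstsq(design, sigma_hat.ravel(order="F"), rcond=None)
    return lam

# Hypothetical check: a matrix with exactly the restricted structure.
sigma_demo = 1.6 * gamma1 + 1.1 * gamma2
lam1, lam2 = initial_lambdas(sigma_demo)
print(lam1, lam2)
```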

5.3.2. Maximum-likelihood Estimation

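Assume that $u_\bullet$ follows a normal distribution. We have

$$R_\bullet = x_\bullet\beta_\bullet + u_\bullet, \qquad u_\bullet \sim N(0, \Sigma_\bullet). \tag{5.11}$$

The likelihood function is then

$$f(\beta_\bullet, \Sigma_\bullet \mid R_\bullet, x_\bullet) = |2\pi\Sigma_\bullet|^{-\frac{1}{2}} \exp\Big\{-\frac{(R_\bullet - x_\bullet\beta_\bullet)'(\Sigma_\bullet)^{-1}(R_\bullet - x_\bullet\beta_\bullet)}{2}\Big\},$$

so the log-likelihood function is given by

$$L(\beta_\bullet, \Sigma_\bullet \mid R_\bullet, x_\bullet) = -\frac{12n}{2}\ln(2\pi) - \frac{n}{2}\ln|\Sigma| - \frac{1}{2}(R_\bullet - x_\bullet\beta_\bullet)'(\Sigma_\bullet)^{-1}(R_\bullet - x_\bullet\beta_\bullet). \tag{5.12}$$

Since we use the restricted variance-covariance matrix $\Sigma^R_\bullet$ in our estimation, we consider the alternative log-likelihood function with $\Sigma^R_\bullet$,

$$L(\beta_\bullet, \Sigma^R_\bullet \mid R_\bullet, x_\bullet) = -\frac{12n}{2}\ln(2\pi) - \frac{n}{2}\ln|\Sigma^R| - \frac{1}{2}(R_\bullet - x_\bullet\beta_\bullet)'(\Sigma^R_\bullet)^{-1}(R_\bullet - x_\bullet\beta_\bullet). \tag{5.13}$$

As $\beta_\bullet$ and $\Sigma^R_\bullet$ are unknown in (5.13), we maximize the log-likelihood function by first concentrating the likelihood.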

¹² The value of the estimates is positive in our case, so we are able to use OLS without posing extra

Differentiation with respect to λ1 and λ2

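First, I differentiate the log-likelihood function with respect to $\lambda_1$. Because $\Sigma^R$ is defined as a function of the unknown parameters $\lambda_1$ and $\lambda_2$, it can be written as $\Sigma^R(\lambda_1, \lambda_2)$. I use the chain rule to proceed:

$$\frac{\partial L(\beta_\bullet, \Sigma^R_\bullet(\lambda_1, \lambda_2))}{\partial\lambda_1} = \frac{\partial L(\beta_\bullet, \Sigma^R_\bullet(\lambda_1, \lambda_2))}{\partial\,\mathrm{vec}'(\Sigma^R(\lambda_1, \lambda_2))} \cdot \frac{\partial\,\mathrm{vec}(\Sigma^R(\lambda_1, \lambda_2))}{\partial\lambda_1}. \tag{5.14}$$

The first partial derivative, $\partial L(\beta_\bullet, \Sigma^R_\bullet(\lambda_1, \lambda_2)) / \partial\,\mathrm{vec}'(\Sigma^R(\lambda_1, \lambda_2))$, can be derived as follows. There are two terms in (5.13) that contain $\Sigma^R$, namely

$$\frac{n}{2}\ln|\Sigma^R| \qquad \text{and} \qquad \frac{1}{2}(R_\bullet - x_\bullet\beta_\bullet)'(\Sigma^R_\bullet)^{-1}(R_\bullet - x_\bullet\beta_\bullet),$$

so we write out their partial derivatives separately.

For the first term, the partial derivative of $\ln|\Sigma^R|$ with respect to $\mathrm{vec}'(\Sigma^R)$ can be written as¹³

$$\frac{\partial(\ln|\Sigma^R|)}{\partial\,\mathrm{vec}'(\Sigma^R)} = \frac{1}{|\Sigma^R|}\frac{\partial|\Sigma^R|}{\partial\,\mathrm{vec}'(\Sigma^R)} = \frac{1}{|\Sigma^R|}|\Sigma^R|\,\mathrm{vec}'\big((\Sigma^R)^{-1}\big) = \mathrm{vec}'\big((\Sigma^R)^{-1}\big). \tag{5.15}$$

As for the second term, it is convenient to rewrite the quadratic form before deriving its partial derivative. Let $U(\beta_\bullet)$: $n \times 12$ be given by

$$U(\beta_\bullet) := (R_1 - x_1\beta_1, \ldots, R_{12} - x_{12}\beta_{12}),$$

then, writing $U$ for $U(\beta_\bullet)$,

$$\begin{aligned} (R_\bullet - x_\bullet\beta_\bullet)'(\Sigma^R_\bullet)^{-1}(R_\bullet - x_\bullet\beta_\bullet) &= \mathrm{vec}'(U)\big((\Sigma^R)^{-1} \otimes I_n\big)\mathrm{vec}(U) \\ &= \big\{\mathrm{vec}'(U)\big((\Sigma^R)^{-\frac{1}{2}} \otimes I_n\big)\big\}\big\{\big((\Sigma^R)^{-\frac{1}{2}} \otimes I_n\big)\mathrm{vec}(U)\big\} \\ &= \mathrm{vec}'\big(U(\Sigma^R)^{-\frac{1}{2}}\big)\,\mathrm{vec}\big(U(\Sigma^R)^{-\frac{1}{2}}\big) \\ &= \mathrm{trace}\big((\Sigma^R)^{-\frac{1}{2}}U'U(\Sigma^R)^{-\frac{1}{2}}\big) \\ &= \mathrm{trace}\big((\Sigma^R)^{-1}U'U\big). \end{aligned} \tag{5.16}$$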
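The identity (5.16), and the fact that $\ln|\Sigma \otimes I_n| = n\ln|\Sigma|$ (which explains the factor $n/2$ in the log-likelihood), can be checked numerically. The sketch below uses an arbitrary positive definite matrix and random residuals; all values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 8, 12
U = rng.normal(size=(n, m))                   # residual matrix, one column per month
A = rng.normal(size=(m, m))
sigma = A @ A.T + m * np.eye(m)               # arbitrary positive definite 12 x 12 matrix
sigma_inv = np.linalg.inv(sigma)

vec_U = U.ravel(order="F")                    # stack the columns of U
quad_form = vec_U @ np.kron(sigma_inv, np.eye(n)) @ vec_U
trace_form = np.trace(sigma_inv @ U.T @ U)

logdet_big = np.linalg.slogdet(np.kron(sigma, np.eye(n)))[1]
logdet_small = np.linalg.slogdet(sigma)[1]
print(quad_form, trace_form, logdet_big, n * logdet_small)
```

Working with the 12 × 12 matrix and the trace form avoids building the $12n \times 12n$ covariance matrix explicitly.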

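Taking the partial derivative of (5.16) with respect to $\mathrm{vec}'(\Sigma^R)$, and suppressing the arguments, we get

$$\begin{aligned} \frac{\partial\,\mathrm{trace}\big((\Sigma^R)^{-1}U'U\big)}{\partial\,\mathrm{vec}'(\Sigma^R)} &= \mathrm{vec}'(I_{12})\frac{\partial\,\mathrm{vec}\big((\Sigma^R)^{-1}U'U\big)}{\partial\,\mathrm{vec}'(\Sigma^R)} \\ &= \mathrm{vec}'(I_{12})(U'U \otimes I_{12})\frac{\partial\,\mathrm{vec}\big((\Sigma^R)^{-1}\big)}{\partial\,\mathrm{vec}'(\Sigma^R)} \\ &= \mathrm{vec}'(I_{12})(U'U \otimes I_{12})\big(-(\Sigma^R)^{-1} \otimes (\Sigma^R)^{-1}\big) \\ &= -\big\{\big((\Sigma^R)^{-1}U'U \otimes (\Sigma^R)^{-1}\big)\mathrm{vec}(I_{12})\big\}' \\ &= -\mathrm{vec}'\big((\Sigma^R)^{-1}U'U(\Sigma^R)^{-1}\big). \end{aligned} \tag{5.17}$$

The second partial derivative on the right-hand side of (5.14) is given by

$$\frac{\partial\,\mathrm{vec}(\Sigma^R(\lambda_1, \lambda_2))}{\partial\lambda_1} = \frac{\partial\,\mathrm{vec}(\lambda_1\Gamma_1 + \lambda_2\Gamma_2)}{\partial\lambda_1} = \mathrm{vec}(\Gamma_1). \tag{5.18}$$

From (5.15), (5.17) and (5.18), it follows that

$$\begin{aligned} \frac{\partial L(\beta_\bullet, \Sigma^R_\bullet(\lambda_1, \lambda_2))}{\partial\lambda_1} &= \Big\{-\frac{n}{2}\mathrm{vec}'\big((\Sigma^R)^{-1}\big) + \frac{1}{2}\mathrm{vec}'\big((\Sigma^R)^{-1}U'U(\Sigma^R)^{-1}\big)\Big\}\mathrm{vec}(\Gamma_1) \\ &= -\frac{n}{2}\mathrm{trace}\big((\Sigma^R)^{-1}\Gamma_1\big) + \frac{1}{2}\mathrm{trace}\big((\Sigma^R)^{-1}U'U(\Sigma^R)^{-1}\Gamma_1\big) \\ &= -\frac{n}{2}\mathrm{trace}\big((\lambda_1\Gamma_1 + \lambda_2\Gamma_2)^{-1}\Gamma_1\big) + \frac{1}{2}\mathrm{trace}\big((\lambda_1\Gamma_1 + \lambda_2\Gamma_2)^{-1}U'U(\lambda_1\Gamma_1 + \lambda_2\Gamma_2)^{-1}\Gamma_1\big) \\ &= -\frac{n}{2}\cdot\frac{1}{\lambda_1}\mathrm{trace}\Big(\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}\Gamma_1\Big) + \frac{1}{2}\cdot\frac{1}{\lambda_1^2}\mathrm{trace}\Big(\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}U'U\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}\Gamma_1\Big). \end{aligned} \tag{5.19}$$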
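The analytic derivative (5.19) can be compared with a finite-difference approximation of the log-likelihood. This is only a numerical sanity check under assumed values: the residual matrix is random, the parameter values $\lambda_1 = 1.5$ and $\lambda_2 = 0.8$ are arbitrary, and the function names are hypothetical.

```python
import numpy as np

def bartlett(size):
    idx = np.arange(size)
    return 1.0 - np.abs(idx[:, None] - idx[None, :]) / size

g1 = bartlett(12)
g2 = np.zeros((12, 12))
g2[1:10, 1:10] = bartlett(9)

rng = np.random.default_rng(5)
n = 30
U = rng.normal(size=(n, 12))                  # stand-in for the residual matrix U(beta)
S = U.T @ U

def loglik_terms(lam1, lam2):
    """The two terms of (5.13) that depend on Sigma^R = lam1 * g1 + lam2 * g2."""
    sigma = lam1 * g1 + lam2 * g2
    return -0.5 * n * np.linalg.slogdet(sigma)[1] - 0.5 * np.trace(np.linalg.solve(sigma, S))

def analytic_dl_dlam1(lam1, lam2):
    """Last line of (5.19)."""
    a_inv = np.linalg.inv(g1 + (lam2 / lam1) * g2)
    return (-0.5 * n / lam1 * np.trace(a_inv @ g1)
            + 0.5 / lam1**2 * np.trace(a_inv @ S @ a_inv @ g1))

lam1, lam2, h = 1.5, 0.8, 1e-6
numeric = (loglik_terms(lam1 + h, lam2) - loglik_terms(lam1 - h, lam2)) / (2 * h)
analytic = analytic_dl_dlam1(lam1, lam2)
print(numeric, analytic)
```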

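Similarly, when differentiating $L(\beta_\bullet, \Sigma^R_\bullet(\lambda_1, \lambda_2))$ with respect to $\lambda_2$, we only need to change $\mathrm{vec}(\Gamma_1)$ in (5.18) into $\mathrm{vec}(\Gamma_2)$. Following steps similar to those that lead to (5.19), we obtain the partial derivative of the log-likelihood function with respect to $\lambda_2$:

$$\frac{\partial L(\beta_\bullet, \Sigma^R_\bullet(\lambda_1, \lambda_2))}{\partial\lambda_2} = -\frac{n}{2}\cdot\frac{1}{\lambda_1}\mathrm{trace}\Big(\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}\Gamma_2\Big) + \frac{1}{2}\cdot\frac{1}{\lambda_1^2}\mathrm{trace}\Big(\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}U'U\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}\Gamma_2\Big). \tag{5.20}$$

Setting (5.19) and (5.20) equal to zero, we obtain

$$-\frac{n}{2}\cdot\frac{1}{\lambda_1}\mathrm{trace}\Big(\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}\Gamma_1\Big) + \frac{1}{2}\cdot\frac{1}{\lambda_1^2}\mathrm{trace}\Big(\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}U'U\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}\Gamma_1\Big) = 0; \tag{5.21}$$

$$-\frac{n}{2}\cdot\frac{1}{\lambda_1}\mathrm{trace}\Big(\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}\Gamma_2\Big) + \frac{1}{2}\cdot\frac{1}{\lambda_1^2}\mathrm{trace}\Big(\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}U'U\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}\Gamma_2\Big) = 0. \tag{5.22}$$

Given a value for $\beta_\bullet$, (5.21) and (5.22) constitute two normal equations in two unknowns.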

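Instead of solving this system of equations directly, I take the following approach. First, from (5.21), I write $\lambda_1$ as a function of the ratio of $\lambda_2$ and $\lambda_1$,

$$\lambda_1 = \frac{\mathrm{trace}\Big(\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}U'U\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}\Gamma_1\Big)}{n\,\mathrm{trace}\Big(\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}\Gamma_1\Big)}. \tag{5.23}$$

Second, I divide (5.21) by (5.22) and obtain

$$\frac{\mathrm{trace}\Big(\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}U'U\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}\Gamma_2\Big)}{\mathrm{trace}\Big(\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}U'U\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}\Gamma_1\Big)} = \frac{\mathrm{trace}\Big(\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}\Gamma_2\Big)}{\mathrm{trace}\Big(\big(\Gamma_1 + \tfrac{\lambda_2}{\lambda_1}\Gamma_2\big)^{-1}\Gamma_1\Big)}. \tag{5.24}$$

Regarding $\lambda_2/\lambda_1$ as an unknown, I solve for it numerically¹⁴ from (5.24). After a solution for $\lambda_2/\lambda_1$ is obtained, $\lambda_1$ follows easily from (5.23).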

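Notice that the restricted variance-covariance matrix $\Sigma^R_\bullet(\lambda_1, \lambda_2)$ has to be positive semi-definite. Consequently, we have to impose restrictions on the parameters $\lambda_1$ and $\lambda_2$. Since $\lambda_1$ and $\lambda_2/\lambda_1$ are viewed as the unknowns, I impose the restrictions on them. I write $\Sigma^R(\lambda_1, \lambda_2)$ as

$$\Sigma^R(\lambda_1, \lambda_2) = \lambda_1\Big(\Gamma_1 + \frac{\lambda_2}{\lambda_1}\Gamma_2\Big).$$

Since $\lambda_1$ is defined to be positive, $\lambda_2/\lambda_1$ has to satisfy

$$\Gamma_1 + \frac{\lambda_2}{\lambda_1}\Gamma_2 \geq 0. \tag{5.25}$$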

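For presentation purposes, let

$$\gamma := \frac{\lambda_2}{\lambda_1};$$

I derive the bound for $\gamma$ analytically as follows.

Let $e_i$, $i = 1, \ldots, 12$, denote the $i$th column of the 12 × 12 identity matrix. Moreover, define $P_{\Gamma_2}$ as the 12 × 12 permutation matrix given by

$$P_{\Gamma_2} = (e_2, e_3, e_4, \ldots, e_{10}, e_1, e_{11}, e_{12}).$$

Pre- and post-multiplying (5.25) by $P_{\Gamma_2}'$ and $P_{\Gamma_2}$, respectively, we obtain the equivalent condition

$$\begin{pmatrix} \Gamma^P_{1,1} & \Gamma^P_{1,2} \\ \Gamma^P_{2,1} & \Gamma^P_{2,2} \end{pmatrix} + \gamma\begin{pmatrix} \tilde\Gamma & 0 \\ 0 & 0 \end{pmatrix} \geq 0, \tag{5.26}$$

where $\Gamma^P_{i,j}$, $i = 1, 2$ and $j = 1, 2$, are the four blocks of the permuted 12 × 12 matrix $\Gamma_1$. In particular, $\Gamma^P_{1,1}$ is a 9 × 9 square matrix and the other blocks are of conformable orders. Furthermore, $\tilde\Gamma$: 9 × 9 is defined in (5.8). Continuing from (5.26), and using that $\Gamma^P_{2,2}$ is positive definite, the Schur complement gives¹⁵

$$\gamma\tilde\Gamma \geq -\Big(\Gamma^P_{1,1} - \Gamma^P_{1,2}\big(\Gamma^P_{2,2}\big)^{-1}\Gamma^P_{2,1}\Big),$$

so

$$\gamma I_9 \geq -\tilde\Gamma^{-\frac{1}{2}}\Big(\Gamma^P_{1,1} - \Gamma^P_{1,2}\big(\Gamma^P_{2,2}\big)^{-1}\Gamma^P_{2,1}\Big)\tilde\Gamma^{-\frac{1}{2}}.$$

Consequently, the bound on $\gamma$ is given by

$$\gamma \geq -\lambda_{\min}\Big\{\tilde\Gamma^{-\frac{1}{2}}\Big(\Gamma^P_{1,1} - \Gamma^P_{1,2}\big(\Gamma^P_{2,2}\big)^{-1}\Gamma^P_{2,1}\Big)\tilde\Gamma^{-\frac{1}{2}}\Big\} = -\lambda_{\min}\Big\{\tilde\Gamma^{-1}\Big(\Gamma^P_{1,1} - \Gamma^P_{1,2}\big(\Gamma^P_{2,2}\big)^{-1}\Gamma^P_{2,1}\Big)\Big\}, \tag{5.27}$$

where $\lambda_{\min}\{.\}$ denotes the smallest eigenvalue of the matrix inside the braces.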
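The bound on $\gamma$ can be computed numerically and checked against the positive semi-definiteness condition (5.25). The sketch below assumes that $\Gamma_1$ and $\tilde\Gamma$ have entries $1 - |i-j|/12$ and $1 - |i-j|/9$, as in Section 5.3.1; the helper names are hypothetical. At the computed bound the smallest eigenvalue of $\Gamma_1 + \gamma\Gamma_2$ should be zero up to rounding, negative just below it and positive just above it.

```python
import numpy as np

def bartlett_toeplitz(size):
    idx = np.arange(size)
    return 1.0 - np.abs(idx[:, None] - idx[None, :]) / size

gamma1 = bartlett_toeplitz(12)
gamma_tilde = bartlett_toeplitz(9)
gamma2 = np.zeros((12, 12))
gamma2[1:10, 1:10] = gamma_tilde

# Permutation (e2, ..., e10, e1, e11, e12), written with 0-based indices.
order = list(range(1, 10)) + [0, 10, 11]
g1p = gamma1[np.ix_(order, order)]
g11, g12, g21, g22 = g1p[:9, :9], g1p[:9, 9:], g1p[9:, :9], g1p[9:, 9:]
schur = g11 - g12 @ np.linalg.solve(g22, g21)

# Eigenvalues of gamma_tilde^{-1} schur via a symmetric similarity transform:
# with gamma_tilde = L L', the matrix L^{-1} schur L^{-T} has the same eigenvalues.
chol = np.linalg.cholesky(gamma_tilde)
sym = np.linalg.solve(chol, np.linalg.solve(chol, schur).T)
gamma_lower = -np.linalg.eigvalsh(sym).min()

def min_eig(gamma):
    """Smallest eigenvalue of Gamma_1 + gamma * Gamma_2."""
    return np.linalg.eigvalsh(gamma1 + gamma * gamma2).min()

print(gamma_lower, min_eig(gamma_lower))
```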

Differentiation with respect to β

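Notice that, for a given $\Sigma^R_\bullet$, maximizing (5.13) with respect to $\beta_\bullet$ is equivalent to

$$\min_{\beta_\bullet}\,(R_\bullet - x_\bullet\beta_\bullet)'(\Sigma^R_\bullet)^{-1}(R_\bullet - x_\bullet\beta_\bullet).$$

This is the same problem that we solve to obtain the FGLS estimator defined in (5.5). Therefore, given an estimate of $\Sigma^R_\bullet(\lambda_1, \lambda_2)$, the estimate of $\beta_\bullet$ has the same form as the FGLS estimator:

$$\hat\beta_\bullet = \Big(x_\bullet'\big(\Sigma^R_\bullet(\hat\lambda_1, \hat\lambda_2)\big)^{-1}x_\bullet\Big)^{-1}x_\bullet'\big(\Sigma^R_\bullet(\hat\lambda_1, \hat\lambda_2)\big)^{-1}R_\bullet. \tag{5.28}$$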


An iterative procedure

The maximum-likelihood estimator is obtained via an iterative procedure, which is summarized as follows:

1. Start the procedure by using the OLS estimates of $\lambda_1$ and $\lambda_2$ in (5.10) to get an initial estimate of $\Sigma^R_\bullet$, say $\Sigma^R_\bullet(\hat\lambda_1, \hat\lambda_2)$;

2. Use $\Sigma^R_\bullet(\hat\lambda_1, \hat\lambda_2)$ to obtain an estimate of $\beta_\bullet$ via (5.28);

3. Substitute $\hat\beta_\bullet$ into (5.24) and solve it to get an estimate of $\lambda_2/\lambda_1$. Subsequently, get an estimate of $\lambda_1$ based on (5.23). As a result, a new estimate of $\Sigma^R(\lambda_1, \lambda_2)$ is obtained;

4. Go back to step 2 and iterate the procedure until convergence.¹⁶ The resulting estimates are the maximum-likelihood estimates;

5. Compute the bound implied by (5.27) and check whether the maximum-likelihood estimates of $\lambda_1$ and $\lambda_2$ satisfy it.
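The alternating structure of the procedure above can be sketched as follows. This is a simplified variant, not the implementation used in this thesis: it starts from an identity covariance matrix instead of the OLS values in (5.10); it replaces the numerical solution of (5.24) by a grid search over the ratio $\gamma = \lambda_2/\lambda_1$; and for each candidate $\gamma$ it sets $\lambda_1 = \mathrm{trace}\big((\Gamma_1 + \gamma\Gamma_2)^{-1}U'U\big)/(12n)$, the scale that maximizes (5.13) for a fixed ratio. All function names and the simulated data are assumptions.

```python
import numpy as np

def bartlett(size):
    idx = np.arange(size)
    return 1.0 - np.abs(idx[:, None] - idx[None, :]) / size

def loglik(U, sigma):
    """Log-likelihood (5.13) in terms of the n x m residual matrix U."""
    n, m = U.shape
    logdet = np.linalg.slogdet(sigma)[1]
    quad = np.trace(np.linalg.solve(sigma, U.T @ U))
    return -0.5 * n * m * np.log(2 * np.pi) - 0.5 * n * logdet - 0.5 * quad

def gls_residuals(R, xs, sigma):
    """GLS step (5.28) for a given 12 x 12 covariance; returns the residual matrix."""
    n, m = R.shape
    k = xs[0].shape[1]
    x_big = np.zeros((m * n, m * k))
    for i, x in enumerate(xs):
        x_big[i * n:(i + 1) * n, i * k:(i + 1) * k] = x
    w = np.kron(np.linalg.inv(sigma), np.eye(n))
    R_big = R.ravel(order="F")
    beta = np.linalg.solve(x_big.T @ w @ x_big, x_big.T @ w @ R_big)
    return (R_big - x_big @ beta).reshape(m, n).T

def fit(R, xs, g1, g2, ratios, max_iter=100, tol=1e-8):
    """Alternate between the GLS step and a grid search over gamma = lambda_2 / lambda_1."""
    n, m = R.shape
    sigma, previous = np.eye(m), -np.inf
    for _ in range(max_iter):
        U = gls_residuals(R, xs, sigma)
        S = U.T @ U
        candidates = []
        for g in ratios:
            a = g1 + g * g2
            lam1 = np.trace(np.linalg.solve(a, S)) / (m * n)   # best scale for this ratio
            candidates.append((loglik(U, lam1 * a), lam1, g))
        value, lam1, g = max(candidates)
        sigma = lam1 * (g1 + g * g2)
        if abs(value - previous) < tol:
            break
        previous = value
    return lam1, lam1 * g, value

# Simulated data with a seasonal covariance structure (illustrative values only).
rng = np.random.default_rng(6)
n, m = 40, 12
g1 = bartlett(12)
g2 = np.zeros((m, m))
g2[1:10, 1:10] = bartlett(9)
errors = rng.multivariate_normal(np.zeros(m), 1.5 * g1 + 3.0 * g2, size=n)
xs = [np.column_stack([np.ones(n), rng.normal(size=n)]) for _ in range(m)]
R = np.column_stack([x @ np.array([1.0, 0.5]) + errors[:, i] for i, x in enumerate(xs)])

lam1_hat, lam2_hat, ll_u = fit(R, xs, g1, g2, ratios=np.linspace(0.0, 4.0, 41))
_, _, ll_r = fit(R, xs, g1, g2, ratios=[0.0])      # restricted fit with lambda_2 = 0
lr = 2 * (ll_u - ll_r)
print(lam1_hat, lam2_hat, lr)
```

The grid here only covers non-negative ratios, for which $\Gamma_1 + \gamma\Gamma_2$ is automatically positive definite; allowing negative ratios would require respecting the bound (5.27).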

5.3.3. A Likelihood Ratio Test

We apply a likelihood ratio test to verify the existence of a seasonality effect. As presented in Section 5.3.1, the null hypothesis of the test is

$$H_0: \lambda_2 = 0.$$

Setting $\lambda_2 = 0$, the restricted covariance matrix under the null hypothesis is given by

$$\Sigma^{RML} = \lambda_1^R\Gamma_1.$$

Following the iterative procedure introduced in the previous subsection, we obtain the restricted maximum-likelihood estimates, $\hat\beta^{RML}_\bullet$ and $\hat\Sigma^{RML}_\bullet$, say. Consequently, the likelihood ratio test statistic can be written as

$$LR = -2\big\{L(\hat\beta^{RML}_\bullet, \hat\Sigma^{RML}_\bullet) - L(\hat\beta^{ML}_\bullet, \hat\Sigma^{ML}_\bullet)\big\}, \tag{5.29}$$

where $\hat\beta^{ML}_\bullet$ and $\hat\Sigma^{ML}_\bullet$ are the maximum-likelihood estimates obtained via the iterative procedure defined in Section 5.3.2. Under $H_0$, $LR$ asymptotically follows a $\chi^2$ distribution with one degree of freedom. The testing results, as well as the ML estimates $(\hat\lambda_1^{ML})^2$ and $\hat\lambda_2^{ML}$, are summarized in Table 5.2.

¹⁶ In my application, I terminate the procedure when the absolute value of the change of two subsequent

Table 5.2.: Likelihood ratio testing results

Model      $\hat\lambda_1^{ML}$   $\hat\lambda_2^{ML}$   $\hat\lambda_1^{RML}$   LR      p-value
Svensson   1.61                   1.13                   2.86                    14.96   0.0001
AFNS       1.82                   0.89                   2.78                     8.69   0.0032
Legendre   1.61                   0.82                   2.70                    21.67   0.0000
CP         1.52                   0.57                   2.10                     1.29   0.2553

Notes: This table summarizes the results of the likelihood ratio tests for the different models considered in this thesis. It also lists the unrestricted maximum-likelihood estimates of $\lambda_1$ and $\lambda_2$ as well as the restricted estimates $\hat\lambda_1^{RML}$. “Svensson” is short for the Svensson model; “AFNS”, “Legendre” and “CP” represent the Arbitrage-free Nelson-Siegel, Legendre polynomials and Cochrane and Piazzesi models. Notice in particular that for the restricted covariance structure, in this test I used $\Gamma_2^{1,11,12}$.

As shown in Table 5.2, the null hypothesis is rejected at the 5% level for the Svensson, AFNS and Legendre models. We cannot reject $H_0$, however, for the Cochrane and Piazzesi model. These results confirm our observations in Figure 5.3, where the seasonal effects seem to be more pronounced for the Svensson, AFNS and Legendre models. These tests also suggest that there exists a seasonality effect in the predictability of bond excess returns. Furthermore, as we can observe from the third column of the table, the maximum-likelihood estimates of $\lambda_2$ are all positive, even for the Cochrane and Piazzesi model. The positive estimates are in accordance with Figure 5.3, where the variances are higher from February to September than in the other months of the year. Thus, the testing results confirm our hypothesis that there is less noise in the bond excess returns in January and from October to December.
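The p-values in Table 5.2 follow from the $\chi^2$ distribution with one degree of freedom, whose upper tail can be written with the complementary error function: $P(\chi^2_1 > x) = \mathrm{erfc}(\sqrt{x/2})$. The short sketch below applies this to the rounded LR statistics from the table, so the results agree with the reported p-values only up to rounding; the function name is a hypothetical choice.

```python
import math

def chi2_1df_pvalue(lr):
    """Upper-tail probability of a chi-square(1) variable: erfc(sqrt(lr / 2))."""
    return math.erfc(math.sqrt(lr / 2.0))

lr_values = {"Svensson": 14.96, "AFNS": 8.69, "Legendre": 21.67, "CP": 1.29}
p_values = {name: chi2_1df_pvalue(lr) for name, lr in lr_values.items()}
rejected_at_5pct = sorted(name for name, p in p_values.items() if p < 0.05)

for name, p in p_values.items():
    print(f"{name}: {p:.4f}")
print(rejected_at_5pct)
```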

Figure 5.1.: Estimates of $\hat\beta_\bullet$ for different models. Panels: (a) AFNS-SUR; (b) Svensson-SUR; (c) Legendre-SUR; (d) Cochrane-Piazzesi-SUR.
