### MSc Stochastics and Financial Mathematics and MSc Econometrics

### Master Thesis

### Probabilistic Modelling of Implementation Shortfall in Equity Trading

Author: Bud Schiphorst

Supervisors: Prof. dr. H.P. Boswijk, dr. ir. E.M.M. Winands

Daily supervisors: dr. M. van der Schans, drs. W. Tilgenberg

Examination date: December 21, 2021

### Statement of Originality

This document is written by Student Bud Schiphorst who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it. UvA Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.

### Abstract

Stock price movements cause costs in equity trading, which are measured by implementation shortfall (IS). These costs are partly driven by market impact of meta-orders.

This research investigates what probabilistic model is appropriate for forecasting IS.

Properties of the distribution of IS are first theoretically computed in a continuous (volume) time propagator model and then empirically assessed. Several point forecasting models are compared in a cross-validation assessment: none of the considered models outperforms the popular Square-Root Law market impact model. Density forecasting models are systematically assessed and compared using scoring rules and probability integral transformed (PIT) observations: it is found that forecasts improve when modelling with a leptokurtic, i.e. non-Gaussian, distribution. Also, the variance of IS is found to scale linearly with the volume time trade duration and a novel approach is proposed for incorporating this dependence in pre-trade density forecasts.

Keywords: meta-order, market impact, implementation shortfall, propagator model, volume time, square-root law, density forecasting, latent variable.

Title: Probabilistic Modelling of Implementation Shortfall in Equity Trading

Author: Bud Schiphorst, budschiphorst@student.uva.nl, 10763945

Supervisors: Prof. dr. H.P. Boswijk, dr. ir. E.M.M. Winands

Daily supervisors: dr. M. van der Schans, drs. W. Tilgenberg

Second Examiners: Prof. dr. C.G.H. Diks, Prof. dr. P.J.C. Spreij

Examination date: December 21, 2021

Korteweg-de Vries Institute for Mathematics, University of Amsterdam

Science Park 105-107, 1098 XG Amsterdam, http://kdvi.uva.nl

Amsterdam School of Economics, University of Amsterdam

Roetersstraat 11, 1018 WB Amsterdam, https://ase.uva.nl/

Robeco Institutional Asset Management B.V.

Weena 850, 3014 DA Rotterdam, https://www.robeco.com/en/

## Contents

1. Introduction

1.1. Practical Applications

1.2. Brief Literature Overview

1.2.1. Market Impact

1.2.2. Probabilistic Modelling of IS

1.3. Research Questions

1.4. Outline and Contributions

2. Theoretical Framework

2.1. Order Execution

2.1.1. Meta-orders and Child Orders

2.1.2. Execution Strategies

2.2. IS and Market Impact

2.2.1. Meta-order IS

2.2.2. Relative Order Size, Participation Rate and Duration

2.2.3. Square-Root Law

2.3. IS in the Propagator Model

2.3.1. General Setting

2.3.2. Heuristic model

2.4. Conclusion

3. Data and Empirical Analysis

3.1. Data

3.1.1. Variables

3.1.2. Filters

3.1.3. Statistics

3.2. Empirical Analysis of Heuristic Model

3.2.1. Square-Root Law

3.2.2. Variance Scaling

3.2.3. Gaussian Distribution

3.3. Conclusion

4. Regression Modelling

4.1. Considered models

4.1.1. Proposed in Literature

4.1.2. Additional Benchmark Models

4.1.3. Non-parametric Benchmarks

4.1.4. Additional Characteristics

4.2. Estimation and Assessment

4.2.1. Estimation

4.2.2. Model Assessment

4.3. Results

4.3.1. Forecast Performances

4.3.2. Estimated Parameters

4.4. Conclusion

5. Assessment of Density Models

5.1. Comparative Assessment

5.1.1. Scoring Rules

5.1.2. Comparative Assessment with Scoring Rules

5.2. Absolute Assessment

5.2.1. Probability Integral Transformed (PIT) Observations

5.3. Conclusion

6. Density Modelling

6.1. Considered Models

6.1.1. Mean Specification

6.1.2. Distribution Specification

6.1.3. Variance Specification

6.1.4. Latent Duration

6.2. Results

6.3. Conclusion

7. Conclusion

7.1. Research Questions

7.2. Contributions and Implications

7.3. Further Research

Popular summary

Bibliography

A. Appendix: Derivations

A.1. Theoretical Framework Derivations

A.1.1. General Setting

A.1.2. Heuristic Model

A.1.3. Discrete time Heuristic IS Variance

A.2. Invariance of Logarithmic Scoring Rules

B. Appendix: Details

B.1. Linear Spline

B.2. LOWESS estimator

B.3. Cross-Validation Testing

B.4. Density Specifications

B.5. Density Forecasting with Latent Duration T

B.5.1. Theoretical Motivation

B.5.2. Practical Specifics

B.5.3. Alternative Approaches

C. Appendix: Figures and Tables

C.1. Extra Regression Variables

C.2. Logarithmic Mean in Density Forecast

C.3. Extra Density Variables

## 1. Introduction

Trading costs are a crucial topic for equity portfolio managers because they account for a substantial part of profits and losses (Bucci, Mastromatteo, et al., 2019). Investment strategies in large portfolios that seemingly achieve abnormal excess returns on paper might have significantly lower actual returns when taking trading costs into account (Briere et al., 2019). Portfolio managers hence use models to predict trading costs pre-trade and to evaluate realized trading results post-trade (Bershova & Rakhlin, 2013).

Some trading costs are relatively easy to model and predict, for example: commissions, fees, taxes and spread costs. In contrast, it is much more difficult to model and predict costs incurred due to stock price movements that occur during order execution.

A buy (sell) order refers to the decision of buying (selling) a specified number of shares of a stock and order execution refers to the actual buying (selling) of the shares. The price at which an order is decided is generally different from the price at which the order is executed. The difference between decision and execution price can be characterized by implementation shortfall (IS), which is defined as

$$\text{IS} = \varepsilon \cdot \frac{[\text{Execution Price}] - [\text{Decision Price}]}{[\text{Decision Price}]}, \tag{1.1}$$

where ε equals 1 for a buy order and -1 for a sell order. A higher IS can be interpreted as a more costly execution of an order.
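As a minimal sketch of definition (1.1), the Python function below computes IS from a decision price and an execution price. The function name and the example prices are illustrative assumptions and are not taken from the data used in this thesis.

```python
def implementation_shortfall(decision_price: float,
                             execution_price: float,
                             is_buy: bool) -> float:
    """Relative implementation shortfall as in (1.1)."""
    if decision_price <= 0:
        raise ValueError("decision_price must be positive")
    eps = 1.0 if is_buy else -1.0  # direction indicator epsilon
    return eps * (execution_price - decision_price) / decision_price


# Hypothetical prices: a buy order decided at 100 and executed at 101.
buy_is = implementation_shortfall(100.0, 101.0, is_buy=True)
# The same upward price move is favourable for a sell order.
sell_is = implementation_shortfall(100.0, 101.0, is_buy=False)
```

In this hypothetical example the buy order has a positive IS (a cost), whereas the sell order has a negative IS of the same magnitude (a gain).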

IS is driven by both market noise and market impact. Market noise refers to the random price fluctuations over time that are unrelated to the order, whereas market impact refers to the concept that buy (sell) orders drive the market price up (down).

Market noise can yield either positive or negative price changes, which are expected to average out over many orders. On the other hand, market impact can only drive the price to be more unfavourable for execution^{1}. For a single order, IS is dominated mostly by market noise and can therefore be either positive or negative. However, IS averaged over many orders is usually positive due to the market impact component.

This thesis focuses on developing a probabilistic model of IS that is appropriate for forecasting trading costs. Some possible practical applications are briefly described.

### 1.1. Practical Applications

Investigating and forecasting expected trading costs is the most popular application of modelling IS. Since market noise is expected to average out over many orders, this is equivalent to modelling and forecasting market impact. This is useful for investigating the average performance of investment strategies in the presence of trading costs.

^{1} For example, the market impact of a buy order drives up the price and thus increases the cost of execution.

As in any application of point estimation or forecasting, it is of course important to provide some quantitative measure of uncertainty. This is especially true if important investment decisions are based on the forecasts. A probabilistic model of IS can for example produce a prediction interval for trading costs, rather than only a point forecast.

Also, portfolio managers are often not interested only in average performance, but also in the risk of investment strategies: for example, a natural decision might be to avoid orders that have a high risk of incurring large trading costs. Practical applications of risk management benefit from an appropriate probabilistic model of IS.

Finally, portfolio managers often analyse and compare investment strategies based on their performance in a controlled portfolio simulation. A probabilistic model of IS can provide a more realistic simulation of performance in the presence of trading costs.

### 1.2. Brief Literature Overview

This section provides a brief overview of existing literature on modelling market impact and implementation shortfall that is relevant to this thesis. See Lillo (2021) for a more complete overview of the literature on modelling price formation in financial markets.

#### 1.2.1. Market Impact

Modelling market impact dates back to the foundational paper by Kyle (1985), which predicts market impact to be a linear function of order size. Later studies consistently conclude that market impact is instead a non-linear concave function of order size and is well approximated by the empirically observed "Square-Root Law" (Lillo, 2021). The latter roughly states that the market impact of an order is proportional to the square root of order size. Donier et al. (2015) argue that the Square-Root Law even appears to be universal, as it has been empirically observed in markets for different financial assets, in different geographical locations, in different time periods and even in relatively new markets (e.g. Bitcoin markets).

Because of the simplicity and the consistent empirical verification of the Square-Root Law model, it remains the most popular market impact model among both practitioners and researchers. Nevertheless, many alternative models have been proposed over the years. For example, Zarinelli et al. (2015) argue for a logarithmic relationship between market impact and order size, whereas Bucci, Benzaquen, et al. (2019) argue that the relationship is instead best described by a cross-over from linear to square root. Some of the proposed alternative models are motivated by their performance in point forecasting trading costs, e.g. by having a lower Root Mean Square Error (RMSE).

In addition to the empirical literature, there is also literature on market micro-structure and price formation that focuses on theoretical models for market impact. Theoretical models proposed in such literature often aim to be consistent with, among other things^{2}, the empirically observed Square-Root Law. Contrary to the consistent agreement on the empirical Square-Root Law, there does not (yet) seem to be a universal theoretical framework of market impact on which most researchers agree. Notable branches in the literature include: Propagator/Transient Impact models (e.g. Bouchaud et al., 2003; Gatheral, 2010; Curato et al., 2017), Locally Linear Order Book (LLOB) models (e.g. Donier et al., 2015; Bucci, Benzaquen, et al., 2019) and models based on Market Microstructure Invariance (e.g. Kyle & Obizhaeva, 2016).

^{2} Other important properties of a theoretical price formation model include: the absence of arbitrage, the existence of equilibria and the existence of optimal trading strategies.

#### 1.2.2. Probabilistic Modelling of IS

There exists relatively little literature on probabilistic modelling of IS: both empirical and theoretical literature often focus on only modelling expected IS, i.e. market impact. For example, the theoretical literature on propagator models explicitly models a stochastic stock price, but still focuses on expected IS and deterministic trading strategies.

Bikker et al. (2010) investigate what factors increase the risk in IS, by estimating a quantile (linear) regression model. They assess the quantile regression forecasts in a Value-at-Risk framework that is similar to that of Christoffersen (1998).

Briere et al. (2020) model the distribution of IS using a Bayesian network model. Their approach easily allows for modelling a non-Gaussian distribution with non-constant variance and for incorporating latent variables. In particular, they find that latent modelling of order book volume imbalances improves point forecasts of IS. They assess their model using point forecast prediction error measures.

Similarly, Markov (2019) models the distribution of IS in a Bayesian regression framework, with the aim of ranking the trading performance of brokers. The proposed model is only assessed by a visual comparison of the modelled predictive distribution and the empirically observed distribution of IS.

### 1.3. Research Questions

The overall aim of this research is to develop a probabilistic model that is appropriate for producing forecasts of implementation shortfall (IS). This is characterized by the following research question:

▶ What is the most appropriate probabilistic model for forecasting IS?

Sub-questions

Attempting to answer this research question naturally induces some sub-questions. For instance, one that considers the theoretical motivation for the empirical models:

(Q.1) ▷ What theoretical framework is appropriate for modelling IS?

Given the extensive amount of literature on market impact modelling, one sub-question considers point forecasting of expected average IS:

(Q.2) ▷ What model appropriately forecasts average IS, i.e. market impact?

In particular, it is investigated whether alternative market impact models improve on the Square-Root Law.

Existing literature on probabilistic modelling of IS does not always use a systematic model assessment approach. A sub-question that addresses this is given by

(Q.3) ▷ How can probabilistic forecasts of IS be appropriately assessed and compared?

Finally, the last sub-question directly relates to the main research question:

(Q.4) ▷ What model produces the best probabilistic forecasts of IS?

### 1.4. Outline and Contributions

The first sub-question (Q.1) is addressed in Chapters 2 & 3. The second sub-question (Q.2) is investigated in Chapter 4. The third sub-question (Q.3) is addressed in Chapter 5 and the fourth sub-question (Q.4) is addressed in Chapter 6.

Chapter 2 presents a theoretical framework for probabilistic modelling of IS, which builds further on earlier literature on continuous-time propagator models (e.g. Gatheral & Schied, 2013). As a contribution to the literature, the variance of IS is derived in the models. Also, the practical convention of modelling in a volume time clock is theoretically motivated by noting the link to subordinated stock price processes as in Clark (1973).

Chapter 3 briefly describes the available data set and analyses the empirical fit of the heuristic theoretical model introduced in Chapter 2. As a main contribution, the variance of IS is shown to have an empirical linear scaling with volume time duration. Also, it is found that the empirical distribution of IS is leptokurtic and therefore non-Gaussian.

Chapter 4 assesses and compares empirical models for market impact, i.e. the (conditional) mean of IS, using a systematic cross-validation methodology. The popular Square-Root Law model is compared to alternative models proposed in the literature and to simple benchmark models. As a contribution to the literature, two non-parametric models are also considered. It is found that the considered models provide neither economically nor statistically significant improvements in point forecasting IS.

Chapter 5 outlines a framework for systematically assessing and comparing probabilistic models for IS based on density forecasts. In particular, it outlines the use of numerical scoring rules and Probability Integral Transforms (PIT) for the assessment of density forecasts. This systematic assessment framework can be considered a contribution to the (relatively limited) empirical literature on probabilistic modelling of IS.

Chapter 6 assesses and compares different probabilistic models for forecasting IS. It is found that incorporating the volume time variance scaling leads to significant improvements in density forecasts. As a contribution, a novel method is proposed for latent modelling of volume time duration for pre-trade density forecasts. Additionally, it is found that density forecasts are improved by considering leptokurtic distributions and more detailed variance specifications.

Finally, Chapter 7 presents an overview of the results and their theoretical and practical implications. The chapter concludes with some ideas for further research.

In short: the Square-Root Law is found to be the most appropriate market impact model out of the considered models. Also, it is concluded that an appropriate probabilistic model should incorporate both a leptokurtic distribution and a variance that scales linearly with volume time duration.

## 2. Theoretical Framework

This chapter outlines a theoretical framework for probabilistic modelling of IS.

The first section introduces concepts that are relevant for understanding the execution of an order, such as meta-orders, order execution strategies and the volume time clock. The second section introduces concepts that are relevant for modelling IS and market impact, such as meta-order implementation shortfall (IS) and the Square-Root Law.

Using the introduced concepts, the third section outlines a continuous-time propagator model for the stock price process in the presence of a meta-order. Similar to earlier literature (e.g. Gatheral & Schied, 2013), the expectation of IS is computed in the propagator model. As a contribution to the literature, the variance of IS is also computed in the continuous-time propagator model^{1}. The section concludes with a heuristic model based on simplifying assumptions that are commonly found in the literature. In particular, the heuristic model predicts the variance of IS to scale linearly with volume time duration, which is verified in the empirical analysis in Chapter 3. In fact, it is found in Chapter 6 that density forecasts of IS are significantly improved by incorporating this scaling.

### 2.1. Order Execution

This section introduces concepts that are important for understanding order execution.

#### 2.1.1. Meta-orders and Child Orders

It is common practice among traders to split large orders into multiple smaller orders, instead of trading them as one whole. The aim is to keep the total size of the order hidden from other market participants for as long as possible.

To avoid ambiguity, some terminology is introduced: the original larger orders are
referred to as meta-orders^{2}, whereas the smaller orders are referred to as child orders.

Similar to other literature, such as Nadtochiy (2020), this research only considers meta-orders of which all the child orders are in the same buy/sell direction.

Bershova and Rakhlin (2013) argue that empirical research on market impact can be categorized into two groups: research that focuses on the market impact of single child orders and research that investigates the market impact of the complete meta-orders.

Earlier research indicates that child orders associated with the same meta-order do not have independent market impact. The latter implies that analysis on meta-order level is more appropriate. This is however not always possible as some databases may fail to capture the meta-order dependence between observed child orders. This is because a meta-order can be executed over multiple days and/or via multiple brokers/exchanges.

^{1} Additionally, a discrete-time derivation is provided in Appendix (A.1.3). The result in continuous-time follows as the limit of the discrete-time result as trading frequency increases to infinity.

^{2} Some literature refers to meta-orders as parent orders.

The proprietary data set used in this research has full identification of the meta-order dependence between all child orders. Therefore this research will focus on analysis on a meta-order level.

#### 2.1.2. Execution Strategies

The process of executing a meta-order over time interval [0, T ] can be characterized by an order execution strategy, which specifies the stock position X(t) held at each time t ∈ [0, T ]. As a practical convention, the stock position X starts in 0 and then becomes positive (negative) for a buy (sell) meta-order during the execution.

More specifically, let Q denote the absolute size of the meta-order. An execution strategy X is then a monotone and absolutely continuous function of time that satisfies^{3}

$$X(0) = 0, \qquad X(T) = \varepsilon \cdot Q, \qquad \text{where } \varepsilon := \begin{cases} 1 & \text{for a buy meta-order,} \\ -1 & \text{for a sell meta-order.} \end{cases} \tag{2.1}$$

In a discrete-time setting, one can characterize the number of traded stocks by the difference in X between two time steps, i.e. ∆X(t_{n}) = X(t_{n}) − X(t_{n−1}). In a continuous-time setting, the change in stock position at time t is instead best characterized by the trading rate $\dot{X}(t)$ given by^{4}

$$\dot{X}(t) := \frac{dX}{dt}(t). \tag{2.2}$$

A continuous-time meta-order execution can be interpreted as a continuum of infinitesimal child order executions. For mathematical convenience, the theoretical framework in this chapter will focus on continuous-time executions.

Volume Time Modelling

A common convention in the market impact literature is to represent the execution time
interval [0, T ] in volume clock time, rather than in chronological clock time. The volume
clock is an example of an event-based clock^{5}, in which time increases proportionally to
traded market volume. Intuitively, the volume clock moves faster when more market
volume is traded and slower when less market volume is traded. In particular, the
volume time duration of a meta-order execution equals the total market volume that
is traded during the execution (up to a scaling unit). To allow for the comparison of
different stocks and markets, the volume time duration is often scaled relative to the
average daily traded market volume of a stock.

Modelling in volume (clock) time rather than in chronological clock time has both practical and theoretical motivation, which are discussed in the next two sections.
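The volume clock described above can be sketched in a few lines of Python. The sketch assumes that traded market volume is observed per (chronological) interval and that time is scaled by the ADV; the function name and all numbers are hypothetical.

```python
from itertools import accumulate


def volume_clock(market_volumes, adv):
    """Volume time at the end of each interval, in units of ADV.

    market_volumes: traded market volume per chronological interval.
    adv: average daily traded volume used as scaling unit.
    """
    if adv <= 0:
        raise ValueError("adv must be positive")
    return [cumulative / adv for cumulative in accumulate(market_volumes)]


# Hypothetical market volumes during four equally long clock intervals.
volumes = [200_000, 50_000, 50_000, 100_000]
adv = 1_000_000  # assumed average daily volume of the stock

clock = volume_clock(volumes, adv)
# The volume time duration of an execution spanning all four intervals.
duration = clock[-1]
```

The clock advances four times as fast in the first interval as in the second, although both intervals have the same chronological length.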

^{3} Note that Gatheral and Schied (2013) are interested in "liquidating" positions and therefore switch the signs and boundary conditions.

^{4} Note that absolute continuity only guarantees that the trading rate exists Lebesgue almost everywhere. The latter is sufficient for the considered purposes.

^{5} See Easley et al. (2012) for an accessible introduction to event-based clocks.

VWAP Execution Strategy

Traders often aim for an execution price that is close to the market Volume Weighted Average Price (VWAP) during the execution. The execution price exactly equals the VWAP for an execution strategy with a trading rate that is proportional to traded market volume. Therefore, such a strategy is called a VWAP execution strategy.

Curato et al. (2017) argue that most algorithmic meta-order executions in practice are well-approximated by a VWAP execution strategy^{6}. Additionally, a VWAP strategy is also the optimal risk-neutral execution strategy in the theoretical framework proposed by Almgren and Chriss (2001).

Note that the trading rate of a VWAP strategy is proportional to the change in volume time, since both are proportional to the change in traded market volume. Because of that, a VWAP strategy X has a simple volume time representation, in which the trading rate is constant:

$$X(t) = (\varepsilon \cdot Q) \cdot \frac{t}{T}, \qquad \dot{X}(t) = (\varepsilon \cdot Q) \cdot \frac{1}{T}, \tag{2.3}$$

where (ε · Q) and T represent signed order size and volume time duration respectively.

This representation of the VWAP strategy simplifies analytical derivations and is an important practical motivation for representing order executions in volume time.
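The volume time representation (2.3) can be illustrated with a short Python sketch. The function names and the example order are hypothetical and only serve to check the boundary conditions in (2.1).

```python
def vwap_position(t, order_size, duration, is_buy=True):
    """Stock position X(t) of a VWAP strategy in volume time, as in (2.3)."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    if not 0.0 <= t <= duration:
        raise ValueError("t must lie in [0, duration]")
    eps = 1.0 if is_buy else -1.0
    return eps * order_size * t / duration


def vwap_trading_rate(order_size, duration, is_buy=True):
    """Constant trading rate of a VWAP strategy in volume time."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    eps = 1.0 if is_buy else -1.0
    return eps * order_size / duration


# Hypothetical sell meta-order of 10,000 shares over half a day of volume time.
Q, T = 10_000, 0.5
start = vwap_position(0.0, Q, T, is_buy=False)
end = vwap_position(T, Q, T, is_buy=False)
rate = vwap_trading_rate(Q, T, is_buy=False)
```

The position starts at zero and ends at the signed order size, while the trading rate does not depend on t.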

Additional Motivation for Volume Time

Another practical motivation for modelling in volume time is that it automatically excludes non-trading hours: e.g. night hours, weekends and holidays. When modelling in chronological clock time, one has to either manually exclude such non-trading hours or model them separately from the trading hours (see e.g. French & Roll, 1986; Jones et al., 1994; Cliff et al., 2008).

In addition to the practical motivations, modelling in the volume time clock also has a theoretical motivation. Almgren et al. (2005) argue that the level of market activity varies substantially between different periods of the trading day and that this affects both the market volume profile and the variance of prices^{7}. Since the price formation of a stock is driven by the trading activity, it is reasonable to model the stock price process using a time clock that is directly related to trading activity.

The idea of modelling the stock price process in a trading activity clock dates back to seminal papers by Mandelbrot and Taylor (1967) and Clark (1973). In particular, Clark (1973) proposes a model in which the stock price process is an arithmetic Brownian motion when considered in volume time. In a more recent paper, Ané and Geman (2000) argue that normality of stock price returns is better observed when modelling in a clock driven by the number of transactions rather than the volume of the transactions.

Howison and Lamper (2001) propose a simple model in which new information flow drives market activity, which then causes stochastic volatility in stock returns in chronological clock time.

^{6} Based on discussions with traders, this also seems to hold for the meta-orders in the database provided by Robeco.

^{7} In this sense, non-trading hours could be considered a period of no market volume/variance.

### 2.2. IS and Market Impact

This section outlines concepts that are important for modelling IS and market impact.

#### 2.2.1. Meta-order IS

Some literature (e.g. Zarinelli et al., 2015; Bucci, Benzaquen, et al., 2019) focuses on modelling total market impact, which is measured by the average (log) difference between the price at the start and end of a meta-order execution. From the perspective of modelling trading costs, it is better to use a measure that incorporates the execution prices of all the child orders. These prices are observed during meta-order execution.

Therefore, empirical literature on market impact trading costs instead often considers
the implementation shortfall (IS), as introduced by Perold (1988)^{8}. The IS of a meta-
order is the relative difference between the realised execution price of the meta-order
and some benchmark price S_{ref}, where the realised execution price of a meta-order is
computed as the weighted average execution price of the child orders.

More specifically, consider a meta-order with absolute order size Q that is executed
over N child orders. The IS of the meta-order is then computed as^{9}

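$$\text{IS} := \frac{\varepsilon}{S_{\text{ref}}} \left[ \sum_{n=1}^{N} \frac{|\Delta X(t_n)|}{Q} \cdot S(t_n) - S_{\text{ref}} \right] = \frac{\sum_{n=1}^{N} [S(t_n) - S_{\text{ref}}] \cdot \Delta X(t_n)}{S_{\text{ref}} \cdot Q}, \tag{2.4}$$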

where S(t_{n}), ∆X(t_{n}) represent the execution price and signed order size of the nth child
order respectively and where ε is an indicator that equals 1 for a buy meta-order and -1
for a sell meta-order. The continuous-time analogue of IS then follows as

$$\text{IS} := \frac{\int_{0}^{T} [S(t) - S_{\text{ref}}] \, dX(t)}{S_{\text{ref}} \cdot Q}. \tag{2.5}$$

In most literature, the benchmark price S_{ref} is chosen equal to the stock price S(0) at the start of the execution. That convention is followed in this thesis. Alternative benchmark prices include the price at the moment of deciding the meta-order or the Volume Weighted Average Price (VWAP) over the time interval of execution.
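The discrete-time definition (2.4) can be sketched in Python as follows. Both forms of (2.4) are implemented so that their equality can be checked numerically. The function names, the representation of child orders as `(execution_price, signed_size)` pairs and the example fills are assumptions made for illustration only.

```python
def meta_order_is(fills, s_ref):
    """IS of a meta-order via the second form of (2.4).

    fills: list of (execution_price, signed_size) pairs, one per child order;
           sizes are positive for buys and negative for sells.
    s_ref: benchmark price, e.g. the price at the start of the execution.
    """
    sizes = [size for _, size in fills]
    if not sizes or s_ref <= 0:
        raise ValueError("need at least one child order and a positive s_ref")
    if not (all(s > 0 for s in sizes) or all(s < 0 for s in sizes)):
        raise ValueError("all child orders must share the same direction")
    q = sum(abs(s) for s in sizes)  # absolute meta-order size Q
    numerator = sum((price - s_ref) * size for price, size in fills)
    return numerator / (s_ref * q)


def meta_order_is_from_average_price(fills, s_ref):
    """IS via the first form of (2.4): weighted average execution price."""
    q = sum(abs(size) for _, size in fills)
    eps = 1.0 if fills[0][1] > 0 else -1.0
    average_price = sum(abs(size) * price for price, size in fills) / q
    return eps * (average_price - s_ref) / s_ref


# Hypothetical buy meta-order of 1,000 shares executed in three child orders.
buy_fills = [(100.0, 300), (100.5, 500), (101.0, 200)]
# Hypothetical sell meta-order with a mirrored price path.
sell_fills = [(100.0, -300), (99.5, -500), (99.0, -200)]

buy_is = meta_order_is(buy_fills, s_ref=100.0)
buy_is_alt = meta_order_is_from_average_price(buy_fills, s_ref=100.0)
sell_is = meta_order_is(sell_fills, s_ref=100.0)
```

Both hypothetical meta-orders have the same positive IS, because in both cases the price moves against the direction of the order during execution.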

#### 2.2.2. Relative Order Size, Participation Rate and Duration

Different stocks generally have different market trading activity, which can be characterized by the Average Daily (traded) Volume (ADV) of a stock. Early literature already concludes that the market impact of a meta-order is better described as a function of order size relative to ADV rather than of absolute order size (Almgren et al., 2005).

^{8} See Kissell et al. (2004) for an intuitive decomposition of IS into delay costs, opportunity costs and trading related costs.

^{9} The second equality provides a rewritten form that is easier to generalize to continuous-time trading and follows from the fact that $\sum_{n=1}^{N} |\Delta X(t_n)| = Q$ and $\Delta X(t_n) = \varepsilon \cdot |\Delta X(t_n)|$.

Relative order size π of a meta-order is defined as the ratio of absolute order size Q and the stock's ADV:

$$\pi := \frac{Q}{ADV}. \tag{2.6}$$

The participation rate η measures how aggressively an order is executed relative to other market participants and is defined as the ratio of absolute order size Q and the traded market volume during execution V_{E}:

$$\eta := \frac{Q}{V_{E}}. \tag{2.7}$$

The (volume time) duration T of a meta-order represents how long it takes to execute an order and is defined as the ratio of traded market volume during execution V_{E} and the average daily traded volume ADV:

$$T := \frac{V_{E}}{ADV}. \tag{2.8}$$

Note that, conditional on relative size π, the participation rate η and duration T are not independent: increasing the aggressiveness of a meta-order will lead to a shorter duration of the execution and vice versa. In fact, conditioned on relative size π, they are inversely proportional:

$$\eta = \frac{Q}{V_{E}} = \frac{Q}{ADV} \cdot \frac{ADV}{V_{E}} = \frac{\pi}{T}. \tag{2.9}$$
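The definitions (2.6) to (2.8) and the identity (2.9) can be checked with a small Python sketch. The function name and the example volumes are hypothetical.

```python
def order_characteristics(order_size, execution_volume, adv):
    """Return (pi, eta, T) as defined in (2.6), (2.7) and (2.8).

    order_size: absolute meta-order size Q.
    execution_volume: traded market volume during execution V_E.
    adv: average daily traded volume of the stock.
    """
    if min(order_size, execution_volume, adv) <= 0:
        raise ValueError("all inputs must be positive")
    pi = order_size / adv                 # relative order size
    eta = order_size / execution_volume   # participation rate
    duration = execution_volume / adv     # volume time duration
    return pi, eta, duration


# Hypothetical meta-order: 50,000 shares traded while the market trades
# 250,000 shares, in a stock with an ADV of 1,000,000 shares.
pi, eta, T = order_characteristics(50_000, 250_000, 1_000_000)
```

For this hypothetical order, π = 0.05, η = 0.2 and T = 0.25, so that η = π / T as stated in (2.9).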

#### 2.2.3. Square-Root Law

Let σ denote the stock volatility measured as a percentage of the stock price and let π again denote the relative order size. The Square-Root Law model then specifies that the conditional expectation of IS, i.e. the market impact I, is proportional to the square root of π and scales with σ (Zarinelli et al., 2015):

$$I(\pi, \sigma) := \mathbb{E}[\text{IS} \mid \pi, \sigma] = \alpha \cdot \sigma \cdot \sqrt{\pi}, \qquad \alpha > 0. \tag{2.10}$$

Alternatively, some literature considers the Square-Root Law to refer to the slightly more general Power Law form^{10}

$$I(\pi, \sigma) := \mathbb{E}[\text{IS} \mid \pi, \sigma] = \alpha \cdot \sigma \cdot \pi^{\delta}, \qquad \alpha, \delta > 0. \tag{2.11}$$

Note that the Square-Root Law implies that, conditional on π, market impact does not have further dependence on participation rate η or duration T. That is, they affect market impact only via their product η · T = π. This implication has also been observed empirically (e.g. by Zarinelli et al., 2015; Bucci, Mastromatteo, et al., 2019). It is especially useful for forecasting market impact costs, as the duration T of execution of a meta-order is usually not known pre-trade (i.e. before the execution).

^{10} Note that π^{δ} reduces to √π for δ = 0.5. Zarinelli et al. (2015) mention that previous studies estimate δ to be between 0.4 and 0.7 and indeed typically very close to 0.5.
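The Power Law form (2.11), with the Square-Root Law (2.10) as the special case δ = 0.5, can be sketched in Python as below. The value α = 1 and the example inputs are purely hypothetical and are not estimates from this thesis.

```python
import math


def power_law_impact(pi, sigma, alpha=1.0, delta=0.5):
    """Market impact I(pi, sigma) = alpha * sigma * pi**delta, as in (2.11).

    The default delta = 0.5 gives the Square-Root Law (2.10).
    alpha = 1.0 is a placeholder, not an estimated parameter.
    """
    if pi < 0 or sigma < 0:
        raise ValueError("pi and sigma must be non-negative")
    if alpha <= 0 or delta <= 0:
        raise ValueError("alpha and delta must be positive")
    return alpha * sigma * pi ** delta


# Hypothetical meta-order: 4% of ADV in a stock with 2% volatility.
impact = power_law_impact(pi=0.04, sigma=0.02)
# Concavity: doubling the order size increases impact by a factor sqrt(2).
ratio = power_law_impact(pi=0.08, sigma=0.02) / impact
```

Under these placeholder values the expected IS is roughly 0.4%, and doubling the relative order size raises it by a factor of about 1.41 rather than 2.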

### 2.3. IS in the Propagator Model

This section considers a continuous-time propagator model for the stock price process
S^{X} = {S^{X}(t)}_{t≥0} in the presence of a meta-order. Note the explicit dependence on the
execution strategy X. The properties of IS are computed first in a (relatively) general
model and then in a simplified heuristic model.

The derivation of the expectation of IS in this model is similar to that in earlier literature (such as Gatheral & Schied, 2013; Curato et al., 2017). As a contribution to the literature, the variance of IS is also computed. For conciseness and readability of the main text, the full derivations are presented in Appendix Section A.1.

#### 2.3.1. General Setting

There exist several theoretical models that aim to explain empirical observations on market impact, but the continuous-time propagator model is one of few to explicitly specify a stochastic model for the stock price process.

Gatheral and Schied (2013) argue that for modelling the conditional stock price process
S^{X}, one often starts by first specifying the unaffected stock price process S^{0} = {S^{0}(t)}_{t≥0}.
The latter represents the stock price process in the absence of a meta-order, i.e. the case
in which X = 0. They also argue that it is a standard assumption in the market impact
literature to assume that S^{0} is a martingale^{11}.

Given the unaffected stock price process S^{0} and execution strategy X, the continuous-
time propagator model proposes that the conditional stock price process is given by

$$S^{X}(t) := S^{0}(t) + \int_{0}^{t} f\big(\dot{X}(s)\big)\, G(t - s)\, ds, \tag{2.12}$$

where Ẋ(s) and f(Ẋ(s)) represent the trading rate and market impact at time s respectively, and where G(t − s) represents the decay of the market impact between time s and t. The function f(·) is referred to as the instantaneous market impact and G(·) is referred to as the decay kernel. Intuitively, the propagator model proposes that total market impact is the aggregation of market impacts in the past that are slowly dissipating over time.

Given the conditional stock price process S^{X}, the implementation shortfall (IS) of a
meta-order is given by^{12}:

$$\mathrm{IS} := \frac{\int_{0}^{T} \big[S^{X}(t) - S(0)\big]\, dX(t)}{S(0) \cdot Q}. \tag{2.13}$$

Combining (2.13) with the properties of X and S^{0}, one can compute the expectation
and variance of IS. The results are presented below and the full derivations are provided
in Appendix (A.1.1).

11 They argue that S^{0} being a semi-martingale is a reasonable assumption and that drift terms may
often be ignored due to the relatively short time horizons of executing a meta-order.

12 Note that the definition of IS here is similar, but different, to the definition of liquidation costs in Gatheral and Schied (2013). In particular, IS is represented as a relative cost rather than absolute.

Expected Value (Market Impact)

It is shown in Appendix (A.1.1) that the expectation of IS, i.e. the market impact, equals

$$\mathbb{E}[\mathrm{IS}] = \frac{1}{S(0) \cdot Q} \int_{0}^{T} \int_{0}^{t} f\big(\dot{X}(s)\big)\, G(t - s)\, ds\, dX(t). \tag{2.14}$$

As expected, market impact depends on instantaneous market impact f, time decay G and execution strategy X.

Notably however, it is independent of the unaffected stock price process S^{0}. Intuitively,
this means that the martingale market noise has no effect on average IS. The latter is a
result of the additive assumption in (2.12).

Variance

It is also shown in Appendix (A.1.1) that the variance of IS equals

$$\mathbb{V}[\mathrm{IS}] = \left(\frac{1}{S(0) \cdot Q}\right)^{2} \cdot \mathbb{E}\left[\int_{0}^{T} \big(X(T) - X(t)\big)^{2}\, d[S^{0}]_{t}\right]. \tag{2.15}$$

Note that the variance of IS is independent of the functions f and G, but does depend
on the execution strategy X and on the quadratic variation of the unaffected process S^{0}.
The variance of IS being independent of f, G is a result of the additive assumption
in (2.12) and the assumption of f, G being deterministic. Intuitively, this implies that
market impact does not affect IS variance. As intuitively expected, a larger quadratic
variation (i.e. variance) in the unaffected process S^{0} leads to a higher variance of IS.

Also, note that |X(T) − X(t)| represents the remaining volume that needs to be executed at time t and that it is generally larger for execution strategies with a longer weighted average (volume time) duration. The result in (2.15) shows that IS variance is larger for such execution strategies. The intuitive explanation for this observation is that a longer weighted average duration corresponds to a larger exposure to market risk.
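The effect of the execution schedule on IS variance can be sketched numerically. The example below assumes a deterministic schedule and an unaffected price with quadratic variation d[S^{0}]_t = σ_D² dt, so that (2.15) becomes σ² · T · ∫₀¹ (1 − x(u))² du, with x(u) = X(uT)/Q the executed fraction and σ = σ_D/S(0). The function names and the front-loaded schedule are hypothetical illustrations.

```python
def is_variance(schedule, T, sigma, n_steps=10_000):
    """Approximate V[IS] from (2.15) for a deterministic execution schedule.

    `schedule(u)` is the executed fraction X(u*T)/Q for u in [0, 1].
    Assumes d[S0]_t = sigma_D**2 dt (an assumption of this sketch), so that
    V[IS] = sigma**2 * T * integral_0^1 (1 - schedule(u))**2 du.
    """
    h = 1.0 / n_steps
    # Midpoint rule applied to the squared remaining fraction.
    integral = h * sum((1.0 - schedule((k + 0.5) * h)) ** 2 for k in range(n_steps))
    return sigma ** 2 * T * integral


def vwap_schedule(u):
    """Constant trading rate in volume time."""
    return u


def front_loaded_schedule(u):
    """Hypothetical schedule that trades faster at the start."""
    return u ** 0.5


sigma, T = 0.02, 2.0
print(is_variance(vwap_schedule, T, sigma))          # close to sigma**2 * T / 3
print(is_variance(front_loaded_schedule, T, sigma))  # close to sigma**2 * T / 6
```

The front-loaded schedule has less remaining volume at every point in time and therefore a smaller variance, in line with the exposure argument above.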

2.3.2. Heuristic model

This section considers a simplified heuristic model based on some additional assumptions.

The aim is to present a model in which the distribution of IS can be computed more explicitly. The assumptions lead to a model that is consistent with the Square-Root Law and that proposes a linear dependence between IS variance and volume time duration.

More specifically, explicit assumptions are considered for the execution strategy X,
the unaffected stock price process S^{0} and the propagator model functions f, G.

Heuristic Assumption 1. The execution strategy X is assumed to be the VWAP strategy, with a volume time representation as presented in (2.3).

This assumption is motivated by the simplicity of the volume time representation and by the prevalence of VWAP execution strategies in practice (and literature). The heuristic assumption is consistent with other market impact literature (e.g. Zarinelli et al., 2015).

Heuristic Assumption 2. The unaffected stock price process S^{0} is assumed to be a
volume time arithmetic Brownian motion

$$S^{0}(t) = S(0) + \int_{0}^{t} \sigma_{D}\, dW(s), \tag{2.16}$$

where σ_{D} represents the daily stock price volatility^{13} and where W is a Wiener process.

This assumption coincides with earlier market impact literature (e.g. Gatheral & Schied, 2013; Curato et al., 2017) and literature on subordinated stock price processes (Clark, 1973). Note that the assumption of constant volatility in volume time induces stochastic volatility in chronological clock time: the chronological clock time volatility is then driven by the stochastic process for cumulative traded market volume.

Also, note that the arithmetic Brownian motion model allows for negative stock prices, making the model unrealistic. Gatheral and Schied (2013) argue however, that due to the short execution time intervals, the probability of negative prices can be considered negligible for realistic parameter values.

An additional motivation for the chosen arithmetic Brownian motion, is that the quadratic variation of the stock price process scales with (volume) time. This is crucial for obtaining a heuristic model in which IS variance scales with volume time duration.

Heuristic Assumption 3. The instantaneous market impact f and time decay kernel G are assumed to have the following forms:

$$f(q) = \operatorname{sign}(q) \cdot \sigma_{D} \cdot \left|\frac{q}{\mathrm{ADV}}\right|^{\delta}, \qquad G(\tau) = \tau^{-\gamma}, \tag{2.17}$$

where δ, γ > 0 are model parameters.

The latter assumptions are consistent with earlier literature on propagator models (Zarinelli et al., 2015; Curato et al., 2017). It is shown in (2.19) that this choice of f, G leads to a market impact model that is consistent with the Square-Root Law. The latter observation seems to be the main motivation for this assumption in earlier literature.

Additionally, Gatheral (2010) shows that there exists no price manipulating arbitrage strategy in a propagator model with the assumptions as in (2.16) and (2.17).

Stock price process in heuristic model

Combining the three heuristic assumptions gives the following stock price process:

$$S^{X}(t) = S(0) + \int_{0}^{t} \varepsilon \cdot \sigma_{D} \cdot \eta^{\delta} \cdot (t - s)^{-\gamma}\, ds + \int_{0}^{t} \sigma_{D}\, dW(s). \tag{2.18}$$

Note that the dW(s) term is the only stochastic term in the heuristic stock price process. It therefore follows that the price returns have a Gaussian distribution. In fact, the Gaussian distribution also follows for IS in the heuristic model (see Appendix A.1.2).

13 Note that σ_{D} represents the stock price volatility in currency units. This notation is to distinguish from σ, which represents volatility as a percentage of the stock price.

Similar to the general setting, the expectation and variance of IS can also be
computed in the heuristic model. Full derivations can be found in Appendix (A.1.2)^{14}.

Expected Value of IS in Heuristic Model

The expectation of IS, i.e. market impact, in the heuristic model is shown to equal

$$\mathbb{E}[\mathrm{IS}] = \alpha \cdot \sigma \cdot \pi^{\delta} \cdot T^{1 - \gamma - \delta}, \tag{2.19}$$

where σ represents the volatility as percentage of the stock price.

Note that if the parameters satisfy 1 − γ − δ = 0, then (2.19) reduces to the Power Law form of the Square-Root Law as described in (2.11). Market impact, conditional on relative order size π, is then independent of the volume time duration T .
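The role of the exponent 1 − γ − δ can be checked with a small sketch of (2.19). The function name and the parameter values are illustrative assumptions.

```python
def heuristic_impact(pi, sigma, T, alpha, delta, gamma):
    """Expected IS in the heuristic model, equation (2.19)."""
    return alpha * sigma * pi ** delta * T ** (1.0 - gamma - delta)


pi, sigma, alpha = 0.10, 0.02, 1.0

# Case 1: gamma + delta = 1, so the duration T drops out (Power Law form).
for T in (0.5, 1.0, 4.0):
    print("gamma=0.5:", T, heuristic_impact(pi, sigma, T, alpha, delta=0.5, gamma=0.5))

# Case 2: gamma + delta < 1, so impact increases with duration T.
for T in (0.5, 1.0, 4.0):
    print("gamma=0.4:", T, heuristic_impact(pi, sigma, T, alpha, delta=0.5, gamma=0.4))
```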

Variance of IS in Heuristic Model

The variance of IS in the heuristic model is shown to equal

$$\mathbb{V}[\mathrm{IS}] = \frac{1}{3} \cdot \sigma^{2} \cdot T. \tag{2.20}$$

Notably, the variance of IS scales with the stock price return variance σ^{2} as well as with
the volume time duration T. This is verified in the empirical analysis in Chapter 3.

Note that the IS variance in (2.20) differs from the variance of the unaffected stock
price process S^{0} only by a factor 1/3. This further illustrates that the result depends
directly on the assumption of S^{0} being an arithmetic Brownian motion (in volume time).
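The factor 1/3 can be checked with a small Monte Carlo sketch. It simulates only the noise part of IS (market impact is set to zero, which does not affect the variance) for a VWAP order split into equal slices, with the unaffected price following an arithmetic Brownian motion in volume time. The discretisation, the seed and the parameter values are assumptions of this sketch.

```python
import math
import random


def simulate_noise_is(sigma, T, n_steps, rng):
    """One draw of the noise part of IS for a VWAP order without market impact.

    The relative price move (S0(t) - S(0)) / S(0) equals sigma * W(t) in volume
    time; the order trades a slice Q / n_steps at the start of every step.
    """
    dt = T / n_steps
    rel_price_move = 0.0
    shortfall = 0.0
    for _ in range(n_steps):
        shortfall += rel_price_move / n_steps  # slice weight is 1 / n_steps
        rel_price_move += sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return shortfall


def sample_variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


rng = random.Random(42)
sigma, T = 0.02, 2.0
draws = [simulate_noise_is(sigma, T, 100, rng) for _ in range(5000)]

print("simulated variance:", sample_variance(draws))
print("sigma^2 * T / 3   :", sigma ** 2 * T / 3.0)
```

With a finite number of slices the simulated variance lies slightly below σ² · T / 3; the gap closes as the number of slices grows, which matches the limit argument of the discrete-time derivation.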

More specifically, the scaling property in the heuristic model crucially depends on S^{0}
having a quadratic variation that is proportional to (volume) time. As an illustrative
example, it is shown in Appendix (A.18) that IS variance does not preserve the same
scaling properties if S^{0} is modelled as a geometric Brownian motion.

Note that the scaling properties do not depend on the choice of modelling in continuous time: see Appendix Section (A.1.3) for an alternative discrete-time derivation.

### 2.4. Conclusion

This chapter addresses sub-question (Q.1), which asks what theoretical framework is appropriate for modelling IS. A continuous-time propagator model is considered, as it allows for explicit modelling of the stock price process in the presence of a meta-order. The often practically motivated convention of modelling in a volume time clock is motivated further from a theoretical perspective.

The chapter concludes with a heuristic model based on conventional simplifying assumptions: a VWAP strategy as order execution strategy, a volume time arithmetic Brownian motion for the unaffected stock price process and propagator model functions that allow for Square-Root Law market impact. The VWAP strategy is motivated by its prevalence in practice and in the literature, as well as by its simple volume time representation. The arithmetic Brownian motion assumption is motivated by noting the link to subordinated stock price processes proposed by Clark (1973) and by noting the resulting linear dependence between variance and volume time duration.

14 Note that a similar result for the variance in the heuristic model is derived in a discrete-time trading setting. The continuous-time result then follows as a limit result as trading frequency increases. This derivation is shown in Appendix (A.1.3).

Similar to earlier literature on propagator models, a derivation of the expectation of implementation shortfall (IS) is provided. It is noted that the latter only depends on the choice of execution strategy and propagator functions and not on the unaffected stock price process. In the heuristic model, the expectation reduces to the Power Law form of the Square-Root Law.

As a contribution to the literature, also the variance of IS is computed and investigated.

It is found to depend both on the execution strategy and on the unaffected stock price
process. In the heuristic model, it is found to scale linearly with stock price return
variance and with volume time duration. It is noted that this property of the heuristic
model crucially depends on the assumption of the arithmetic Brownian motion, but not
on the continuous-time nature of the model. In principle, the general model in Section
2.3.1 allows for non-Gaussian IS, since S^{0} can be any martingale. Note however, that
to preserve the linear dependence between IS variance and volume time, the execution
strategy would then have to be different from a VWAP strategy.

In summary, the heuristic model proposes that

$$\mathrm{IS} \sim \mathcal{N}\left(\alpha \cdot \sigma \cdot \pi^{\delta},\; c \cdot \sigma^{2} \cdot T\right), \tag{2.21}$$

where π represents relative order size, σ represents stock volatility (in percentage units), T represents volume time duration and α, δ, c represent positive constants.
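A minimal sketch of how (2.21) translates into a density forecast is given below. It returns the forecast mean and standard deviation and evaluates the probability integral transform (PIT) of an observed IS, i.e. the forecast CDF at the observation. Function names and parameter values are hypothetical; note that T must be known or forecast for this to be a pre-trade forecast.

```python
import math


def heuristic_forecast(pi, sigma, T, alpha, delta, c):
    """Mean and standard deviation of IS under the Gaussian model (2.21)."""
    mean = alpha * sigma * pi ** delta
    std = math.sqrt(c * sigma ** 2 * T)
    return mean, std


def gaussian_pit(is_observed, mean, std):
    """Gaussian CDF of the forecast evaluated at the observed IS."""
    z = (is_observed - mean) / std
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Illustrative parameter values (not estimates from this thesis).
mean, std = heuristic_forecast(pi=0.10, sigma=0.02, T=1.5, alpha=1.0, delta=0.5, c=1.0 / 3.0)
print(mean, std, gaussian_pit(0.01, mean, std))
```

If the forecast distribution is correct, the PIT values of realised observations are uniformly distributed on [0, 1].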

The empirical fit of the heuristic model properties is briefly investigated in Chapter 3.

Chapter 4 then specifically investigates whether other models for the expectation of IS improve on the Square-Root Law model. Chapter 6 investigates whether alternative distribution or variance specifications provide better probabilistic forecasts of IS.

## 3. Data and Empirical Analysis

This chapter provides an introduction to empirical modelling of implementation shortfall (IS) and investigates the empirical fit of the heuristic model proposed in Chapter 2.

The first section introduces the available data set by outlining how the important variables are computed, listing and motivating the used data filters and providing some descriptive statistics.

The second section assesses the empirical fit of the heuristic model by investigating the properties of IS as predicted in (2.21). It is identified in what aspects other empirical models could improve on the heuristic model. The latter serves as a motivation for the considered empirical models in later chapters.

### 3.1. Data

This thesis considers a proprietary data set of historical meta-orders executed by asset
manager Robeco in the period from January 2016 up to September 2020. The data
consists of a total number of 209,811 meta-orders, covering 6653 different stocks and
totalling over €172 billion in traded volume^{1}.

3.1.1. Variables

The absolute order size Q of a meta-order is the total number of shares in the meta-order.
The Average Daily Volume (ADV) is computed as the 21-day historical average
of daily traded market volume^{2}. The traded market volume during execution V_{E} of a
meta-order is computed and provided by an external data provider that specializes in
broker trading data. Using these variables, one can compute the relative order size π
(2.6), participation rate η (2.7) and the (volume time) duration T (2.8):

$$\pi := \frac{Q}{\mathrm{ADV}}, \qquad \eta := \frac{Q}{V_{E}}, \qquad T := \frac{V_{E}}{\mathrm{ADV}}. \tag{3.1}$$
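The definitions in (3.1) can be sketched in a few lines. The function and argument names are hypothetical, and the numbers are made up for illustration.

```python
def average_daily_volume(daily_volumes, window=21):
    """Historical average of daily traded market volume over the last `window` days."""
    recent = daily_volumes[-window:]
    if not recent:
        raise ValueError("at least one daily volume is required")
    return sum(recent) / len(recent)


def order_variables(order_size, adv, volume_during_execution):
    """Relative order size pi, participation rate eta and volume time duration T, as in (3.1)."""
    pi = order_size / adv
    eta = order_size / volume_during_execution
    T = volume_during_execution / adv
    return pi, eta, T


adv = average_daily_volume([1_000_000] * 21)
pi, eta, T = order_variables(order_size=50_000, adv=adv, volume_during_execution=500_000)
print(pi, eta, T)  # pi = eta * T by construction
```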

As noted in Chapter 2, the stock price volatility σ should be represented as a percentage
of the stock price. In this thesis^{3}, σ is computed as the 15-day historical range-based
volatility introduced by Parkinson (1980).
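As an illustration, the Parkinson (1980) estimator uses daily high and low prices: σ² = (1 / (4 ln 2)) · mean of (ln(H/L))². The sketch below applies it over a trailing window. The function name, the input layout and the absence of any further scaling are assumptions; the exact implementation used for the data set is not specified here.

```python
import math


def parkinson_volatility(highs, lows, window=15):
    """Range-based daily volatility (as a fraction of price) over the last `window` days."""
    pairs = list(zip(highs, lows))[-window:]
    if not pairs:
        raise ValueError("at least one (high, low) pair is required")
    mean_sq_log_range = sum(math.log(h / l) ** 2 for h, l in pairs) / len(pairs)
    return math.sqrt(mean_sq_log_range / (4.0 * math.log(2.0)))


# Made-up prices: every day the high is 2% above the low.
print(parkinson_volatility([102.0] * 15, [100.0] * 15))
```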

1 The statistics are computed after applying the later data filters in Section 3.1.2.

2 There is a trade-off in deciding the number of days in the historical averages: the most recent observations are more representative of the near future, but using more observations improves robustness against outliers. This research follows the definitions as used in earlier research by Robeco.

3 See the MSc thesis by Blonk (2013) for motivation for choosing range-based volatility instead of close-price volatility, based on empirical analysis.

The realized execution price of a meta-order is computed as the volume-weighted average of the execution price of the child orders, excluding taxes and fees. The realized implementation shortfall (IS), as defined in (2.4), is computed with the price S(0) at the start of the meta-order as reference price. More specifically, the reference price equals the next-tick price at the moment of sending the first child order to a broker. During continuous trading hours, the latter just equals the stock price at that moment. During non-trading hours, the next-tick price equals the stock price in the very next tick that the market is open for trading.
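A minimal sketch of this computation is shown below. It assumes the sign convention that IS is positive when a buy executes above, or a sell below, the reference price; the definition in (2.4) is not repeated in this section, so that convention and all names are assumptions.

```python
def realized_is_bps(child_orders, reference_price, side):
    """Realized implementation shortfall in basis points.

    child_orders: list of (execution_price, shares), excluding taxes and fees.
    side: +1 for a buy meta-order, -1 for a sell meta-order (assumed convention).
    """
    total_shares = sum(shares for _, shares in child_orders)
    if total_shares <= 0:
        raise ValueError("total executed shares must be positive")
    # Volume-weighted average execution price of the child orders.
    execution_price = sum(price * shares for price, shares in child_orders) / total_shares
    return side * (execution_price - reference_price) / reference_price * 1e4


print(realized_is_bps([(100.5, 100), (101.5, 100)], reference_price=100.0, side=+1))
```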

The data set also includes stock-specific characteristics, such as: the bid-ask spread as a 5-day historical average, the total market capitalization and characteristics of the market at which the stock is traded. The latter include geographical indicators such as the country and region, the currency, and whether the market is considered an emerging market (EM) or developed market (DM).

Finally, the data set also includes some order-specific characteristics, such as: a buy/sell indicator, the date and time of the meta-order, the fund for which it was executed and the managers/traders involved in the execution.

3.1.2. Filters

Before the empirical data analysis, some of the historical meta-orders are removed from the data set, according to the following filters:

• Block Trade meta-orders: Contrary to regular meta-orders, block trade meta-orders are executed only if there is enough liquidity. That is, if the trader finds a counterparty that is willing to trade for a price that is close to a reference price. Since these trades only get executed when market impact is low, not removing them from the data set would lead to underestimation of market impact.

• Funds that do not always fully execute meta-orders: For some of the Robeco funds, it is possible that the execution of a meta-order is stopped prematurely. Since not fully executing a meta-order is likely correlated with market price movements, leaving in these funds would introduce bias in measuring market impact (Brokmann et al., 2015). Note that only removing the meta-orders that did not fully execute, instead of the entire funds, would still lead to biased estimation. The removed funds constitute roughly 5% of all meta-orders.

• Incorrect values: Some variables have naturally bounded domains, e.g. trade size and volatility should only take positive values. The handful (0.17%) of observations that violate such domain bounds are removed.

• Too slow executions: Meta-orders for which the volume time duration of exe- cution exceeds 15 days are excluded from the data set. These seem to be outliers from the normal execution procedure and introduce a lot of noise due to the very long exposure to market risk. These constitute less than 0.1% of the observations.
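The four filters above can be sketched as a single predicate on a meta-order record. The record layout (dictionary keys such as `is_block_trade`, `fund`, `Q`, `sigma`, `T`) is hypothetical; only the 15-day duration threshold and the positivity checks come from the text.

```python
def keep_order(order, partial_execution_funds, max_duration=15.0):
    """Return True if a meta-order passes all data filters."""
    if order["is_block_trade"]:
        return False  # block trades only execute when liquidity is available
    if order["fund"] in partial_execution_funds:
        return False  # entire fund removed, not only the unfinished orders
    if order["Q"] <= 0 or order["sigma"] <= 0:
        return False  # values outside their natural domain
    if order["T"] > max_duration:
        return False  # too slow executions
    return True


orders = [
    {"is_block_trade": False, "fund": "A", "Q": 1000, "sigma": 0.02, "T": 0.8},
    {"is_block_trade": True, "fund": "A", "Q": 5000, "sigma": 0.02, "T": 0.1},
    {"is_block_trade": False, "fund": "B", "Q": 1000, "sigma": 0.02, "T": 0.5},
    {"is_block_trade": False, "fund": "A", "Q": 1000, "sigma": 0.02, "T": 20.0},
]
filtered = [o for o in orders if keep_order(o, partial_execution_funds={"B"})]
print(len(filtered))
```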

3.1.3. Statistics

Table 3.1 shows some descriptive statistics of the meta-orders, grouped per bucket of
relative size π. The majority of the meta-orders are of small relative size, but the data
set also contains some very large meta-orders^{4}.

The average implementation shortfall is increasing in relative size π, which is consistent with market impact being an increasing function of π. Notably, the standard deviation is also increasing in π. The latter can be (partially) explained by the larger meta-orders having a longer (volume time) duration T and thus having longer exposure to market noise. The standard deviation of the volume time duration is also larger for larger meta-orders.

Table 3.1: Descriptive statistics for different buckets of the relative size π. The first columns represent the share of each relative size bucket in number of meta-orders and total meta-order size respectively. The final columns represent the average implementation shortfall (in basis points) and the volume time duration respectively. The standard deviation is displayed within brackets. The statistics are computed after applying the filters in Section 3.1.2.

| Relative Size π | #Orders | Value | IS (bps) | Duration T |
|---|---|---|---|---|
| 0.0% - 2.0% | 66.3% | 32.8% | 5 (118) | 0.78 (0.69) |
| 2.0% - 5.0% | 12.9% | 18.2% | 14 (131) | 0.97 (0.80) |
| 5.0% - 10.0% | 8.3% | 15.0% | 24 (133) | 1.17 (0.96) |
| 10.0% - 25.0% | 8.0% | 18.7% | 33 (141) | 1.62 (1.37) |
| 25.0% - 50.0% | 3.0% | 9.5% | 52 (167) | 2.47 (1.89) |
| 50.0% - 100.0% | 1.1% | 4.4% | 71 (183) | 3.80 (2.52) |
| 100.0% - 300.0% | 0.3% | 1.3% | 78 (274) | 5.92 (3.38) |
| All | 209,811 | 171.5 (Bln) | 12 (127) | 1.00 (1.06) |
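Bucket statistics of this kind can be computed along the following lines. The bucket edges mirror the table; whether edges are inclusive on the left or right is an assumption of this sketch (left-inclusive here), and the input layout is hypothetical.

```python
import bisect
import statistics

# Bucket edges for relative size pi, as fractions of ADV.
BUCKET_EDGES = [0.0, 0.02, 0.05, 0.10, 0.25, 0.50, 1.00, 3.00]


def bucket_statistics(orders):
    """Group (pi, is_bps) pairs by bucket; return {bucket_index: (count, mean, std)}."""
    grouped = {}
    for pi, is_bps in orders:
        idx = bisect.bisect_right(BUCKET_EDGES, pi) - 1
        if 0 <= idx < len(BUCKET_EDGES) - 1:  # drop values outside [0, 3.0)
            grouped.setdefault(idx, []).append(is_bps)
    return {
        idx: (
            len(values),
            statistics.fmean(values),
            statistics.stdev(values) if len(values) > 1 else 0.0,
        )
        for idx, values in sorted(grouped.items())
    }


# Made-up observations for illustration.
stats = bucket_statistics([(0.01, 4.0), (0.01, 6.0), (0.03, 14.0), (5.0, 99.0)])
print(stats)
```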

Distribution of Duration T

The realised (volume time) duration T of a meta-order execution depends on many operational aspects, such as staff working hours and exchange trading hours. Providing a full outline of such dependencies is outside the scope of this research, if not practically impossible. Therefore, this research considers the duration as a random variable that is not observed pre-trade, i.e. not before the start of the meta-order execution.

Section 6.1.4 outlines a novel approach of incorporating information on the historical distribution of duration into density forecasts for IS. For that purpose, historical observations of duration T are divided into different segments based on relative size π, fund-specific execution type and geographical location.

Each meta-order is labelled Type 1 or Type 2 based on the fund-specific overall execution strategy^{5}. Figure 3.1 displays the empirical distribution of duration T for each type. Note that the Type 2 meta-orders are more likely to be executed quickly, whereas the Type 1 meta-orders are more likely to have a slower execution.

4 Note that order size relative to historical ADV can exceed 100%. This is possible because daily traded volume differs over time and because meta-order execution may span several days.

5 These fund-specific details are considered proprietary information and have no further relevance for this research.

Each observation is also labelled with one of the following regions: Europe, America or Asia. Figure 3.2 displays the empirical distribution of duration T for each region.

Note that the differences in the distribution of volume time duration per region can be attributed to operational aspects such as market trading hours.

Figure 3.1: Empirical distribution of volume time duration T per fund-specific execution type.

Figure 3.2: Empirical distribution of volume time duration T per geographical region.

### 3.2. Empirical Analysis of Heuristic Model

This section investigates the empirical fit of the heuristic model proposed in Chapter 2.

3.2.1. Square-Root Law

Rewriting the Power Law form (2.11) of the Square-Root Law gives

$$\frac{I}{\sigma} = \mathbb{E}\left[\frac{\mathrm{IS}}{\sigma} \,\middle|\, \pi\right] \propto \pi^{\delta}. \tag{3.2}$$

The rewritten form in equation (3.2) allows for an easier visual inspection of the relation between relative size π and market impact I.

Figure 3.3 shows scaled IS averaged per bucket of relative size π and a fitted Square-Root Law model. In particular, the plot seems consistent with earlier empirical findings on the concavity of market impact. From the plot, it also seems that the hypothesized Square-Root Law approximates the relation reasonably well. However, it is not unlikely that some other parametrized concave function also allows for a visually close fit.

[Figure: Scaled Implementation Shortfall IS/σ vs. Relative Size π (zoomed). Horizontal axis: relative size π; vertical axis: IS/σ. Series: Bucket Means, Square Root.]

Figure 3.3: Visualisation of the relation between implementation shortfall over volatility IS/σ and relative size π, for values of π ranging from 0 to 100%. The average IS over volatility per bucket of relative size π is displayed in red and a fitted Square-Root Law (2.10) is displayed in black.

Figure 3.4 shows the same relation, but adds the empirical quantiles of scaled IS, which indicate a large amount of noise around the expectation. The market impact relation looks like a relatively flat line when adding the quantiles to the figure.

[Figure: Scaled Implementation Shortfall IS/σ vs. Relative Size π. Horizontal axis: relative size π; vertical axis: IS/σ. Series: Bucket Means, Square Root, quantiles q0.25/q0.75 and q0.10/q0.90.]

Figure 3.4: Visualisation of the relation between implementation shortfall over volatility IS/σ and relative size π, for values of π ranging from 0 to 100%. The average IS over volatility per bucket of relative size π is displayed in red and a fitted Square-Root Law (2.10) is displayed in black. The orange and green lines represent the 25%/75% and 10%/90% percentiles of IS over volatility respectively.

Figure 3.5 provides an additional plot in which both axes are in log-scale. This allows for
investigation of the empirical fit of the Square-Root Law across all orders of magnitude
of π. The plot also shows a fitted linear and a fitted logarithmic model for comparison^{6}.
The Square-Root Law seemingly provides a reasonable empirical fit for all orders of
magnitude.

[Figure: Scaled Implementation Shortfall IS/σ vs. Relative Size π (log-log). Horizontal axis: relative size π; vertical axis: IS/σ. Series: Bucket Means, Square Root, Logarithm, Linear.]

Figure 3.5: Visualisation of the relation between implementation shortfall (IS) over volatility and the relative size π on a log-log scale. In red the average IS over volatility per bucket of π. In blue a fitted Square-Root Law model (2.10), in green a fitted logarithmic model (4.2) and in purple a fitted linear model (4.6). All model parameters are estimated using Non-linear Least Squares (see Section 4.2.1).

6 Note that these models and their estimation are discussed in Chapter 4.

3.2.2. Variance Scaling

A notable result of the heuristic model is that it predicts that the variance of IS scales
linearly with both the stock price variance σ^{2} and with the volume time duration T .

The empirical fit of these hypothesized linear relationships is displayed in Figures 3.6 and 3.7 respectively. The linear relationships fit the empirically observed relations quite well, indicating that the heuristic model predictions on the variance of IS are consistent with the observed data.

[Figure: IS Standard Deviation σ_IS vs. Stock Volatility σ. Horizontal axis: σ; vertical axis: σ_IS. Series: Empirical relation, Linear relation.]

Figure 3.6: The empirical relation between the standard deviation σ_IS of IS and stock volatility σ.

[Figure: Scaled IS Variance V(IS/σ) vs. Volume Time T. Horizontal axis: volume time T; vertical axis: V(IS/σ). Series: Empirical relation, Linear relation.]

Figure 3.7: The empirical relation between the variance σ²_IS/σ² of scaled IS and volume time duration T.
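A linear relation of the form V(IS/σ) = c · T can be fitted to bucket-level variances with a least squares line through the origin. The sketch below uses made-up bucket values that satisfy c = 1/3 exactly; the function name is hypothetical and the fitting method behind the figures is not specified here.

```python
def slope_through_origin(xs, ys):
    """Least squares slope c for the model y = c * x without intercept."""
    denominator = sum(x * x for x in xs)
    if denominator == 0:
        raise ValueError("xs must contain a non-zero value")
    return sum(x * y for x, y in zip(xs, ys)) / denominator


# Made-up bucket averages of duration T and variances of scaled IS.
durations = [0.5, 1.0, 2.0, 4.0]
scaled_variances = [t / 3.0 for t in durations]
print(slope_through_origin(durations, scaled_variances))
```

On real data the estimated slope depends on the units in which IS and σ are expressed, so only the linearity, not the level, is directly comparable to (2.20).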

3.2.3. Gaussian Distribution

The final important implication of the heuristic model is that IS follows a Gaussian dis-
tribution. The latter is a direct consequence of the assumption that the unaffected stock
process S^{0} is an arithmetic Brownian motion (in volume time). This section assesses
the modelled Gaussian distribution by investigating the standardized IS residuals of the
heuristic model. The standardized residuals are computed as

$$\frac{\mathrm{IS} - \hat{\alpha} \cdot \sigma \cdot \pi^{\hat{\delta}}}{\sigma \sqrt{T}}, \tag{3.3}$$

where ˆα, ˆδ are estimated using Non-linear Least Squares (NLS).

Figure 3.8: The empirical distribution of implementation shortfall residuals obtained from the Power Law form (2.11) of the Square-Root Law model, divided by the theoretical standard deviation σ · √T. A Gaussian distribution fitted to the same sample mean and variance is displayed for comparison.

From Figure 3.8 it seems that the Gaussian distribution is at most a reasonable approximation for the standardized residuals, implying a violation of at least one of the assumptions in the heuristic model.

In particular, the kurtosis of the empirically observed distribution is too large to be Gaussian: the standardized residuals have a sample kurtosis of 6.61, whereas a Gaussian distribution has a kurtosis equal to 3. Accordingly, a Jarque-Bera test rejects Gaussianity at the 1% significance level, which is consistent with the earlier remarks.
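The statistics used in this assessment can be sketched as follows. The Jarque-Bera statistic is JB = n/6 · (S² + (K − 3)²/4) with sample skewness S and sample kurtosis K; under Gaussianity it is asymptotically chi-square distributed with 2 degrees of freedom, whose survival function is exp(−JB/2). The function name and the toy sample are illustrative assumptions.

```python
import math


def jarque_bera(xs):
    """Return (skewness, kurtosis, JB statistic, asymptotic p-value)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a Gaussian distribution
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # chi-square(2) survival function
    return skewness, kurtosis, jb, p_value


# Toy sample; in practice xs would hold the standardized residuals from (3.3).
print(jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0]))
```

Gaussianity is rejected at the 1% level when the p-value is below 0.01, i.e. when JB exceeds roughly 9.21.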

### 3.3. Conclusion

This chapter further addresses sub-question (Q.1), which asks what theoretical framework is appropriate for modelling IS. In particular, it investigates the empirical fit of the heuristic model proposed in Chapter 2.

More specifically, the implications of the heuristic model described in (2.21) are empirically
assessed in this chapter. Most notably, the prediction of IS variance scaling
linearly with stock price variance σ^{2} and volume time duration T seems to be consistent
with empirical findings. Additionally, the Square-Root Law model (2.10) for market
impact seems visually plausible, but a different concave function could possibly provide
a similar or better empirical fit. Finally, it is found that the empirical distribution of IS
is leptokurtic, i.e. has higher kurtosis than the Gaussian distribution.

The latter indicates that at least one of the assumptions in the heuristic model is incorrect, for example the assumption of a VWAP execution strategy or the assumption of the (volume time) arithmetic Brownian motion. Nevertheless, the heuristic model is still useful as a simple approximate model for empirical analysis.

Chapter 4 investigates whether average IS is better modelled and forecasted by models other than the Square-Root Law model. Chapter 6 investigates whether alternative distribution and variance specifications provide better density forecasts than the heuristic model.

## 4. Regression Modelling

This chapter focuses on investigating what model provides the best empirical fit in modelling and forecasting market impact. The market impact I of a meta-order is considered to be the expectation of implementation shortfall (IS), conditional on the meta-order characteristics X:

$$I(X) = \mathbb{E}[\mathrm{IS} \mid X]. \tag{4.1}$$

Modelling and forecasting market impact can therefore be considered equivalent to regressing and point forecasting IS.

In particular, it is investigated whether any of the considered models significantly improve on the Power Law form (2.11) of the Square-Root Law in point forecasting IS.

The considered models include alternative parametric models, non-parametric models and models with additional explanatory variables. The models are systematically assessed and compared based on their point forecasts in a cross-validation (CV) setting.

The parametric models are estimated using (Weighted) Non-linear Least Squares (WNLS) and the associated standard errors are computed using a bootstrap.
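As a rough illustration of unweighted NLS for the Power Law form (2.11), the sketch below profiles out α (which has a closed-form least squares solution for a fixed δ) and searches δ over a grid. The grid search is a simplification chosen for this example, not the estimation routine of this thesis; weights and bootstrap standard errors are omitted, and all names and data are hypothetical.

```python
def fit_power_law(pis, sigmas, iss, deltas=None):
    """Least squares fit of IS = alpha * sigma * pi**delta; returns (alpha, delta)."""
    if deltas is None:
        deltas = [k / 1000.0 for k in range(100, 1001)]  # grid 0.100, 0.101, ..., 1.000
    best = None
    for delta in deltas:
        x = [s * p ** delta for s, p in zip(sigmas, pis)]
        # For a fixed delta the model is linear in alpha.
        alpha = sum(v * y for v, y in zip(x, iss)) / sum(v * v for v in x)
        sse = sum((y - alpha * v) ** 2 for v, y in zip(x, iss))
        if best is None or sse < best[0]:
            best = (sse, alpha, delta)
    return best[1], best[2]


# Noise-free synthetic data generated with alpha = 0.8 and delta = 0.5.
pis = [0.01 * k for k in range(1, 51)]
sigmas = [0.02] * len(pis)
iss = [0.8 * s * p ** 0.5 for s, p in zip(sigmas, pis)]
print(fit_power_law(pis, sigmas, iss))
```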

### 4.1. Considered models

This section provides an overview of the considered market impact models.

4.1.1. Proposed in Literature

Although the Square Root Law model (2.10) and the Power Law form (2.11) are the most popular in market impact literature, alternative models have also been proposed.

For instance, Zarinelli et al. (2015) instead propose a logarithmic dependence between market impact I and relative order size π:

$$I(X_{i} \mid \alpha, \beta) = \alpha \cdot \sigma_{i} \cdot \log\left(1 + \beta \cdot \pi_{i}\right). \tag{4.2}$$

They motivate this by arguing that the logarithmic function has a better empirical fit,
especially for very small or large orders of magnitude of π, i.e. π < 10^{−3} or π > 10^{−1}.

Alternatively, literature in the locally linear order book (LLOB) framework proposes that the market impact function is better approximated by a function that is linear for small π and a square-root for larger π (e.g. Donier et al., 2015; Bucci, Benzaquen, et al., 2019). The mentioned literature seemingly proposes no explicit form of this model.

Therefore two forms are proposed here: first, a function with a smooth exponentially weighted transition^{1} between a linear and a square-root function

$$I(X_{i} \mid \alpha, \beta, \gamma) = \alpha \cdot \sigma_{i} \cdot \pi_{i} \cdot e^{-\gamma \cdot \pi_{i}} + \beta \cdot \sigma_{i} \cdot \sqrt{\pi_{i}} \cdot \left(1 - e^{-\gamma \cdot \pi_{i}}\right), \qquad \gamma \geq 0. \tag{4.3}$$
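Alternatively, a continuous function with a discrete transition between a linear and a
square-root function

$$I(X_{i} \mid \alpha, \beta, \pi^{*}) = \begin{cases} \alpha \cdot \sigma_{i} \cdot \dfrac{\pi_{i}}{\sqrt{\pi^{*}}} & \text{for } \pi_{i} < \pi^{*}, \\[1ex] \alpha \cdot \sigma_{i} \cdot \sqrt{\pi_{i}} & \text{for } \pi_{i} \geq \pi^{*}, \end{cases} \qquad \pi^{*} \geq 0. \tag{4.4}$$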
Finally, Frazzini et al. (2018) propose an additive model with both a square-root and a
linear part:

$$I(X_{i} \mid \alpha, \beta, \gamma) = \alpha + \beta \cdot \sigma_{i} \cdot \pi_{i} + \gamma \cdot \sigma_{i} \cdot \sqrt{\pi_{i}}. \tag{4.5}$$

The latter model seems to be motivated mainly by its simplicity and empirical fit^{2}.
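The four models (4.2) to (4.5) can be written as plain functions, which makes their limiting behaviour easy to check. The sketch reads (4.4) as α · σ · π / √π* below the threshold, which is the linear function that joins the square-root branch continuously at π*; the function names and parameter values are hypothetical.

```python
import math


def log_model(pi, sigma, alpha, beta):
    """Logarithmic model (4.2)."""
    return alpha * sigma * math.log(1.0 + beta * pi)


def smooth_transition_model(pi, sigma, alpha, beta, gamma):
    """Exponentially weighted mix of a linear and a square-root term (4.3)."""
    weight = math.exp(-gamma * pi)  # 1 for small pi, 0 for large pi
    return alpha * sigma * pi * weight + beta * sigma * math.sqrt(pi) * (1.0 - weight)


def discrete_transition_model(pi, sigma, alpha, pi_star):
    """Linear below pi_star and square-root above it (4.4)."""
    if pi < pi_star:
        return alpha * sigma * pi / math.sqrt(pi_star)
    return alpha * sigma * math.sqrt(pi)


def additive_model(pi, sigma, alpha, beta, gamma):
    """Additive model with a linear and a square-root part (4.5)."""
    return alpha + beta * sigma * pi + gamma * sigma * math.sqrt(pi)


sigma, pi_star = 0.02, 0.05
just_below = discrete_transition_model(pi_star - 1e-12, sigma, 1.0, pi_star)
at_threshold = discrete_transition_model(pi_star, sigma, 1.0, pi_star)
print(just_below, at_threshold)  # nearly identical: (4.4) is continuous at pi_star
```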
4.1.2. Additional Benchmark Models

In addition to the models considered in earlier literature, some other ad-hoc model specifications are also considered. They are considered as simple benchmark models and are proposed here without any additional theoretical or empirical motivation.

The following benchmark models are considered:

▷ Linear model in π

I(X_{i}| α, β) = α + β · σ_{i}· π_{i}. (4.6)

▷ Third-order polynomial in π

$$I(X_{i} \mid \alpha, \vec{\beta}) = \alpha + \sum_{k=1}^{3} \beta_{k} \cdot \sigma_{i} \cdot \pi_{i}^{k}. \tag{4.7}$$

▷ Third-order polynomial in √π

$$I(X_{i} \mid \alpha, \vec{\beta}) = \alpha + \sum_{k=1}^{3} \beta_{k} \cdot \sigma_{i} \cdot \pi_{i}^{k/2}. \tag{4.8}$$
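The three benchmark models share one functional form and differ only in the number of terms and in the power step, which a single hypothetical helper can express. The sketch also checks that the polynomial in √π (4.8) with β₃ = 0 coincides with the additive model (4.5), since its first two terms are a square-root and a linear term.

```python
def polynomial_model(pi, sigma, alpha, betas, power_step=1.0):
    """alpha + sum_k betas[k-1] * sigma * pi**(k * power_step).

    power_step=1.0 with one beta gives the linear model (4.6), with three
    betas the polynomial (4.7); power_step=0.5 gives the polynomial in
    sqrt(pi) of (4.8).
    """
    return alpha + sum(
        beta * sigma * pi ** (k * power_step) for k, beta in enumerate(betas, start=1)
    )


pi, sigma = 0.04, 0.02
linear = polynomial_model(pi, sigma, 0.001, [0.5])
cubic = polynomial_model(pi, sigma, 0.001, [0.5, -0.2, 0.1])
sqrt_poly = polynomial_model(pi, sigma, 0.001, [0.7, 0.3, 0.0], power_step=0.5)
print(linear, cubic, sqrt_poly)
```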

4.1.3. Non-parametric Benchmarks

Additional to the considered parametric models above, two less restrictive non-parametric models are also considered. If these models significantly outperform the parametric models, that might indicate that the collection of considered parametric models is too

1 Note that lim_{π→0} e^{−γ·π} = 1 and lim_{π→∞} e^{−γ·π} = 0.

2 Frazzini et al. (2018) estimate a linear regression model with the mentioned square-root and linear term and some additional explanatory variables.