Is more always better? The effect of betting volume on market efficiency: an empirical research into the sports betting market

Statement of Originality

This document is written by student Mats Mackaij who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


Abstract

In this thesis I try to find evidence for the Efficient Market Hypothesis in the men's football betting market. The main research question this paper tries to answer is: are popular markets more efficient than less popular markets? I use a large dataset of 102,888 football events, combining the betting odds of a betting exchange and eight bookmakers, to see if the odds of events with a high betting volume are better predictors of the actual outcome than the odds of events with a lower betting volume. Intuition obtained from previous studies forms the basis of my main hypothesis: more popular matches, i.e. matches with a higher betting volume, will have odds that are more precise estimators of the actual probability of the outcomes than less popular matches. I use a simple probit model with the result of the bet as dependent variable and the implied probability of the odds as the only independent variable. By dividing the dataset into 5 bins according to betting volume, the effect of betting volume on the predictive power of the odds can be evaluated. The results of the analysis show that betting volume in football matches has a clear positive relation to the predictive power of the odds. Furthermore, it is found that the betting exchange is more efficient than all bookmakers. A simple betting strategy is proposed based on the findings of this paper, yielding substantial ex ante returns.


Table of contents

Abstract
1. Introduction
2. Sports betting
2.1 Efficient Market Hypothesis
2.2 Pari-mutuel
2.3 Bookmaker
2.4 Betting exchange
2.5 Volume
3. Hypothesis
4. Data and Descriptive Statistics
5. Empirical Research
5.1 Models
5.2 Assessment of the models
6. Results
6.1 Regression results model 1
6.2 Second hypothesis
6.3 Results regression model 2
6.4 Simple Betting Strategy
7. Conclusion and Discussion
8. Appendix


1. Introduction

Sports betting exists across all times and cultures. Evidence of early sports betting is found in North America, where the native population bet on foot races and ball games (Stewart Culin, 1921, in Sauer, 1998). In ancient Rome, sports betting was common practice. The famous racetrack Circus Maximus attracted an estimated 260,000 people, many of whom had placed bets on the races (Humphrey, 1986).

In the United States the first gaming laws were introduced after World War II, making almost all types of sports betting illegal. Nevada and Maryland were the only two states with any form of legalized gambling, most of it slot machines. It was not until 1970 that sports betting became legal in some states (“US Gambling Laws”, 2016). In 1992 sports betting was outlawed in almost all states by the Professional and Amateur Sports Protection Act, leaving only Nevada with all types of legal sports betting. It is, however, estimated that Americans placed $149 billion in illegal sports bets in 2015 (American Gaming Association, 2016). In Europe, laws on sports betting differ from country to country. In the United Kingdom, bookmaking has been legal since the 1960 Act. Sports betting became legal in Spain in the 1980s. In Germany bookmakers have been able to apply for a license since 2012, which means that the sports betting market is regulated. Other countries with regulated sports betting markets are the Netherlands, France and Australia. 'In-game' bets are illegal in Australia, where only pre-match bets are allowed.

When bookmakers noticed the possibilities of the internet just after the turn of the millennium, online bookmakers started websites to attract more customers. In 2012 the size of the regulated global betting market was $58 billion, which accounted for 14% of total global gambling, according to a report from the European Gambling and Betting Association. The report also states that the unregulated market is many times larger. Estimates suggest that the total sports betting market, including illegal betting, surpasses one trillion US dollars (Daily Mail, 2015). According to Statista, the size of online global gambling in 2015 was $41.36 billion. Along with the online bookmakers came a renewed interest in empirical research in the field of sports betting. One important focus of these papers is the efficiency of these markets as described in the efficient market hypothesis by Malkiel and Fama (1970).

In this thesis I try to find evidence for the efficient market hypothesis in the men's football betting market. My empirical research focuses on the effect of betting volume on the predictive power of odds. The main research question this paper tries to answer is: are popular markets more efficient than less popular markets? I use a large dataset of 102,888 events, combining the betting odds of a betting exchange and eight bookmakers, to see if the odds of events with a high betting volume are better predictors of the actual outcome than the odds of events with a lower betting volume. Insights obtained from previous studies form the basis of my main hypothesis: more popular matches, i.e. matches with a higher betting volume, will have odds that are more precise estimators of the actual probability of the outcomes than less popular matches. I use a simple probit model with the result of the bet as dependent variable and the implied probability of the odds as the only independent variable. By dividing the dataset into 5 bins according to betting volume, the effect of betting volume on the predictive power of the odds can be evaluated. The models are evaluated with two types of goodness-of-fit tests often used in previous literature, pseudo-R2's and ROC-based statistics. In addition, a new goodness-of-fit test is included in the analyses, called the heat map statistic. The results of the analysis show that betting volume has a clear positive relation to the predictive power of the odds. Furthermore, it is found that the betting exchange is more efficient than all bookmakers. A simple betting strategy is proposed that exploits the differences in predictive power of odds in different markets. When applied to my dataset the strategy would have yielded substantial profits of up to 42% over around 7,000 bets.

The implications of these results are twofold. First, if these returns of 16-42% are sustainable, bookmakers will have to change their system to avoid major losses to bettors exploiting the difference in efficiency. A second implication is that bettors should bet more on less popular matches because the pricing in these markets is less efficient, opening up opportunities for substantial profits.

As for the scientific relevance of this research, an addition to the existing literature on sports betting is made. This paper is unique in combining the betting volume obtained from the betting exchange and applying it to both betting exchange and bookmaker odds by matching events. Studies on the effect of betting volume are sparse and based on small datasets. My study combines a large dataset with odds from a range of bookmakers to create a stronger research foundation. Furthermore, my research contributes to the evidence on the efficient market hypothesis and adds to research on the superiority of betting exchanges over bookmakers in terms of efficiency. The insights of this paper aim to provide a better understanding of the dynamics of market efficiency.

The rest of this thesis is structured as follows. The following chapter reviews the existing literature on market efficiency, sports betting and the effects of betting volume. Subsequently, I outline the data and discuss the descriptive statistics. Thereafter, the results of the regression models are discussed and the paper closes with the conclusion of the research along with ideas for future research.


2. Sports betting

2.1 Efficient Market Hypothesis

According to the Efficient Market Hypothesis (EMH) of Eugene Fama, an efficient market is defined as follows: "A market in which prices always fully reflect all available information is called efficient" (Malkiel & Fama, 1970). Fama divides his hypothesis into three levels of tests, namely weak form, semi-strong form and strong form tests. In the weak form test, the information available to traders is no more than the historical prices; in the semi-strong form test the question is whether prices adjust to publicly available information (e.g. stock splits, public announcements); and the strong form test includes monopolistic information, which is information that is only available to a small group of investors. In his paper he concludes that there is strong empirical evidence in the existing literature for weak form as well as semi-strong form efficiency in capital markets. Strong form efficient market models are to be used as benchmarks to check for deviations, and not as a realistic description of the world, according to Fama. This is based on expected return efficient market models, i.e. the expected profits of a speculator should be zero (Malkiel & Fama, 1970). This makes economic sense, because if there were a stock with positive expected profit, everybody would want to buy that stock, driving up the price until the positive expected profit is reduced to zero. Fama concludes the following: "the evidence in support of the efficient markets model is extensive and (somewhat uniquely in economics) contradictory evidence is sparse" (Fama, 1970, p. 416). According to Grossman and Stiglitz (1980) a market cannot be perfectly efficient, because then there would be no incentive for professionals to search for information to gain only a short advantage. In a meta-study from 2003, Malkiel concludes that patterns can appear in stock returns but that these patterns are short-lived. Because no method has yet been found to gain above-normal returns, Malkiel argues that the EMH still holds (Malkiel, 2003).

The original paper on the EMH bases the hypothesis on capital markets. Later, other markets, like betting markets, were tested for market efficiency. The link between sports betting and security markets is not new. Smith explains why the move of interest to betting markets is justified: "…institutional forms of betting are of scientific interest for two reasons. They yield behavioral implications for individual decision-making under uncertainty. Furthermore, since these betting schemes give rise to wager markets with equilibrating functions similar to ordinary security (or commodity) markets, they afford opportunities for the study of market mechanisms under a widened class of contingency and institutional conditions." (Smith, 1971, p. 242). In a paper from Wayne Snyder, published in 1978, the link between the betting market for horse racing and the security market is made, after which the efficiency of the betting market is tested. The similarities between both markets are a large number of participants, perfect information (or extensive common knowledge) and no entry or exit barriers. These are all conditions for a market with perfect competition, which stems from general equilibrium theory, often attributed to Leon Walras' Elements of Pure Economics from 1874. In such a market no firm can make above-normal profits, because new firms would enter, causing prices to fall. A market with perfect competition is rare, but the stock market has been found to have characteristics that come close. Furthermore, both markets contain risk and uncertainty, which makes them interesting to study (Snyder, 1978). One could even say that betting markets are simpler and thus a better subject for testing efficiency theories because of the fixed end date of bets, after which the true value is revealed (Thaler & Ziemba, 1988; Gray & Gray, 1997).

Research on the efficiency of betting markets dates back to as early as the 1940s. Looking at the efficiency of the betting market for horse races, Griffith (1949) finds that the odds are on average a good reflection of the actual chances of a horse winning. However, he also finds that there is a systematic overvaluation of long odds and an undervaluation of short odds, a phenomenon later called the favorite/longshot bias (henceforth denoted by FLB). Systematic over- or undervaluation suggests that the market is not completely efficient. In an efficient market, a temporary FLB, when spotted, should be exploited by investors by betting on the undervalued short odds, thereby decreasing these odds until they are no longer undervalued. The efficiency criterion used by Griffith, namely that the implied probability of the odds should equal the actual probability of the outcome, is used in almost all literature on efficiency in betting markets, including this paper. Odds represent the return on a particular bet. Underlying these odds is an expected probability that the event will occur. This is called the implied probability of the odds. If odds are priced efficiently, the inverse of the odds represents the implied probability. As an example: a certain event A has odds of 2.00, which means that if you bet $10 on event A and event A occurs, you receive $20 (a profit of $10). If the odds are efficient, the expected value of the bet should be zero. This is the case if event A occurs $\frac{1}{2.00} = 0.5$ or 50% of the time. Whenever the odds are priced in a way that no systematic gain can be made, the market is said to be efficient.

I will now discuss some of the results of the comprehensive literature on market efficiency in sports betting markets. The literature can be roughly divided into three categories: pari-mutuel betting, odds or point spread betting with bookmakers, and betting exchanges.

2.2 Pari-mutuel

The early quantitative research into betting markets focused on horse races, which usually employ a pari-mutuel betting system. In this system all bets of a certain kind, for example on which horse finishes first, are collected in a pool. After the racetrack has taken a fixed percentage of the pool, called the 'track take', the remainder is divided among the winners. This means that the odds of a certain bet are not known until the betting closes and the size of the pool has been determined. To analyze this type of betting, the proportion of money on a certain horse is interpreted as the implied probability of all the bets. Analyzing over 50,000 horse races, Thaler and Ziemba find that the weak form efficiency condition is violated because bets on 'extreme favorites' (odds less than 1.30, which implies a probability of 76.9%) have positive expected values. Furthermore, they find that the expected return falls significantly for odds above 19.00 (5% implied probability), to a return of 13.7 cents per dollar for odds above 100.00 (1% implied probability). Even though the results show some inefficiency, the authors conclude that the racetrack betting market is surprisingly efficient, meaning that the odds are good estimates of the probability of winning (Thaler & Ziemba, 1988).

Another way to test for efficiency is to see if all bets have the same expected return. When comparing two types of bets, namely the 'daily double' and the 'corresponding parlay', no difference in expected return was found (Ali, 1979). Also, no significant difference was found in terms of information available to bettors: outsiders (who have access to publicly available information) collectively have just as good information as insiders (who have monopolistic access to additional information) collectively (Dowie, 1976). Figlewski concludes that information contained in the predictions of professional handicappers is included efficiently in the odds in the horse racing betting market, which is evidence for efficiency (Figlewski, 1979). On the other hand, there is also literature suggesting that these pari-mutuel betting markets are inefficient. The expected return is not the same for all bets and there is a clear bias. However, this bias is not large enough to overcome the loss incurred by the track take of almost 20 percent, so it is not possible to make a systematic profit in these markets (Snyder, 1978).

Hausch, Ziemba and Rubinstein (1981) come up with a system that produces substantial profits, suggesting that this particular betting market is not (weak form) efficient. In agreement with this inefficiency, Tuckwell (1983) finds that betting in the starting odds range 2.75-5.00 yields a profit of more than 5 percent. Asch, Malkiel and Quandt (1984) find no positive profit opportunities in betting on the favorites, yet they do find positive profit opportunities in show bets (top three finish) or place bets (top two finish). In greyhound races, above-normal profits can be achieved by using ex-ante bettor handicapping data, thereby violating market efficiency (Goodwin & Corral, 1996).

The main conclusion from the literature on pari-mutuel betting is that the markets are relatively efficient but that in some cases small positive profits can be made, violating this efficiency.


2.3 Bookmaker

Apart from gambling on legalized horse or greyhound racing, other types of sports betting are illegal in the United States of America with the exception of four states. In contrast, bookmaking is legal in the United Kingdom. In most other European countries, betting is regulated.

In some sports, mostly those where a lot of points are scored, like basketball, American football and sometimes football, an often found type of bet is the point spread: "- that is, a number of points by which one team is viewed as being stronger than the other. You can then place a bet on either team. If you bet on the stronger team, you give away the point" (Vergin & Scriabin, 1978). After analyzing 6 years of National Football League (NFL) point spread data, the authors find multiple profitable betting strategies. Similarly, an article by Zuber, Gandar and Bowers (1985) and one by Gray and Gray (1997) expose inefficiencies in the NFL betting market.

As mentioned before, Griffith (1949) documents a systematic overvaluation of longshots in horse race betting. This FLB is a recurring finding in all kinds of sports betting markets. Mark Rubinstein even finds evidence for a FLB in the option market, where short-maturity out-of-the-money calls are priced relatively higher than other calls (Rubinstein, 1985).

Research about market efficiency and the FLB was done in other sports as well, for example the baseball betting market. This market differs from the horse racing market in a few aspects. First of all, an individual is able to lay (sell) or back (buy) a bet in the baseball betting market whereas on a racetrack you can only back a bet and the track bookmakers lay bets. Additionally, there is more certainty about payoffs because the baseball market used in the research employs odds that are fixed after a bet has been agreed upon. Less uncertainty means that bettors can make better decisions, which should lead to a higher efficiency. Furthermore, Woodland and Woodland perceive racetrack gambling as a more pleasure-oriented activity, while baseball gambling is done with the goal of making money. Last, commissions are lower in baseball, which allows for professional gambling and thus presumably more efficient betting (Woodland & Woodland, 1994).

Surprisingly, Woodland and Woodland (1994) find a reverse FLB in their data from Major League Baseball, meaning that favorites are over-bet, leading to a loss on average.

For the football (soccer) betting market, a model based on past match results has been suggested that is able to generate a profit of 8%, implying that not all historical information is fully reflected in the bookmakers' odds (Goddard & Asimakopoulos, 2004). The home advantage and team strength are incorporated efficiently in the UK football market (Graham & Scott, 2008), but the FLB is also observed in this market (Cain, Law & Peel, 2000).

Next, there is a comprehensive set of literature on more recent bookmakers who operate mostly via the internet. These papers focus on all kinds of sports like tennis, American football, basketball and football. An advantage of these bookmakers is their size. While the number of bets is not made public, William Hill, one of the biggest online bookmakers, is worth $5.17 billion and had a revenue of $1.59 billion in 2015 (William Hill PLC, 2015). Due to this size, such large markets should in theory come closer to an efficient market than smaller markets like a racetrack. Two disadvantages when researching bookmakers are the relatively high 'take' as well as the interference in the formation of the odds. Bookmakers are better at predicting the outcome of a bet than a typical bettor. Instead of just trying to balance the amount bet on a win and a loss and skimming off a percentage as profit, bookmakers use their predictive power to exploit the bettors. In the process of doing so, they distort the equilibrium where supply and demand meet. A condition for this to work is that the bookmaker has to be better than most of the bettors or limit the more informed bettors, which seems to be the case (Levitt, 2004). This interference creates more noise when analyzing this type of market.

2.4 Betting exchange

One of the most traditional forms of betting is person-to-person betting, where one person says some event will happen and wants to bet that it does, while the other person does not believe the event will happen and takes on the bet. Recently this old type of betting system was rediscovered. This system, called a betting exchange, is a marketplace (website) which consists of person-to-person betting on a large scale. Founded in 2000, the first and most popular betting exchange on the internet is Betfair.com, with 1.7 million active users and a revenue of $477 million in 2015 (Betfair Group PLC, 2015). The main differences between a traditional bookmaker and a betting exchange like Betfair are the possibility to lay bets and the extent of interference in the formation of the odds. On a betting exchange bettors can take the role of bookmaker by specifying the odds they are willing to pay if a specific event happens, along with a maximum amount they are willing to risk. For example, assume player A lays a bet with odds of 2.00 saying Real Madrid will win their next match, for a maximum of $100. If player B thinks Real Madrid is going to win, he can back the bet, after which player A and B have a deal. In the case that Real Madrid wins, player B gains his amount wagered times 2.00, and in the case Real Madrid loses, player A receives the amount wagered by player B. A betting exchange therefore has an additional similarity to a stock market, where you also buy and sell assets. The odds on a betting exchange are fully determined by demand and supply without interference of a bookmaker, making it a better-suited market for efficiency testing than the markets discussed above. Betfair does subtract a percentage of every winning bet (2-5%), but this does not affect the odds ratio.

Because, as Levitt (2004) concludes, bookmakers have superior knowledge about the probabilities of an outcome and use this knowledge to make profits, conclusions drawn from their odds could be biased. This should not be the case for a betting exchange, where there is no interference of bookmakers. The market of a betting exchange is therefore similar to the stock market and thus even more interesting to look at when investigating market efficiency. When comparing matched data on horse racing from Betfair and a traditional bookmaker, Smith et al. find that the Betfair data show significantly lower biases, making it a more efficient system (Smith, Paton, Vaughan & Williams, 2006). The authors attribute this higher efficiency to lower transaction costs and thus suggest that transaction costs are partly responsible for the FLB. This model is called a cost-based model of the FLB, suggested by Hurley and McDonough (1995). A study of the men's professional singles tennis market is unable to find evidence of a FLB or any other market inefficiencies, thereby supporting the cost-based explanation of the FLB (Valtonen, 2013). It is even possible to think of a simple betting strategy of selecting bets that have higher odds with a bookmaker compared to Betfair to generate a small profit, demonstrating the superior efficiency of the betting exchange (Franck, Verbeek & Nüesch, 2010).

It has been shown that in-game odds of football matches on Betfair are almost immediately adjusted after news pertaining to goals (or the lack thereof) comes in, demonstrating the high level of efficiency and the speed of incorporating new information (Croxson & Reade, 2014). On the other hand, this efficient incorporation of information is not always observed. Evidence from analyses of the Ryder Cup golf competition suggests that prices underreact to news events, similar to other financial markets (Docherty & Easton, 2012). The difference between the last two findings could be due to differences in the type of information or news events, caused by differences between the sports studied, or for example the betting volume, which is lower for golf events.

2.5 Volume

As mentioned above, one of the essential characteristics of a market with perfect competition (which tends to yield efficiency) is a 'sufficiently' large number of participants. Unfortunately, there is no uniformly applicable definition of what constitutes a 'sufficiently large number' that one could use to test for this characteristic. It is, however, possible to compare markets with 'many' participants to markets with fewer participants and see whether, ceteris paribus, heavily populated markets are more efficient than thin markets. For pari-mutuel horse racing, this has indeed been found to be the case. Races with a high betting volume have a significantly greater proportion of market efficient bets (Walls & Busche, 1996). This study was later reproduced with a substantially larger dataset, yielding the same results, i.e. less inefficiency in races with high bet volumes (Busche & Walls, 2000). Other studies show that the FLB is negatively correlated with betting volume (Gramm & Owens, 2005; Gramm & Owens, 2006). Furthermore, Smith et al. (2006) show that in both the betting exchange and the traditional betting market, higher betting turnover reduces all biases. In a comparison between the highly popular English football league and the less popular Swedish football league it was found that both markets have a high level of efficiency on average; however, the variation in the English football league was significantly lower, providing evidence that markets with higher information flow and betting volume are more efficient (Anners & Saarm, 2015).


3. Hypothesis

Based on the literature review and insights gained from the stock market I formulate expectations for the empirical analysis. Betting markets are found to be efficient on average, with the exception of a few minor inefficiencies. On average, the odds are good predictors of the actual probability of an outcome. This expectation is based on literature covering several sports, as no argument has been found that would suggest substantial differences between betting markets of different sports. To investigate whether more popular markets are more efficient I propose two hypotheses that will be subjected to a range of efficiency tests.

Hypothesis 1: More popular matches, i.e. matches with a higher betting volume, will have odds that are more precise estimators of the actual probability of the outcomes than matches with a lower betting volume.

This hypothesis is based on the literature on betting volume, where the consensus is that a higher betting volume reduces biases and thus inefficiencies. Because Betfair exhibits higher efficiency than traditional bookmakers, or according to some research even has no inefficiencies (Valtonen, 2013), it could be that both crowded and thin markets on Betfair are efficient, in the sense that the odds are on average good predictors of the actual outcome. However, as shown by Anners and Saarm (2015), there could still be a difference in efficiency in the form of variance. Traditional betting markets are more prone to biases, so the expectation is that the effect of betting volume on efficiency will be more clearly visible there.

Hypothesis 2: The market of the most popular events is less efficient than the market of slightly less popular events.

In other words, the odds of bets with the absolute highest betting volume are worse predictors of the actual outcome than the odds of bets with a slightly lower betting volume. Inexperienced and uninformed bettors tend to bet on the most popular matches, for example the top 1% in terms of betting volume, because these are bets involving the most popular and most famous teams. Bets just below this group of most popular matches, for example around the 99th percentile, do have the efficiency gains from the betting volume but do not have the efficiency loss due to inexperienced and uninformed bettors.


4. Data and Descriptive Statistics

For data on bookmaker odds I use a dataset from football-data.co.uk containing historical results and odds for 22 leagues from 11 countries in professional men's football (soccer). The dataset contains all competition matches played between January 2006 and December 2014, totaling 64,909 matches. The data were extracted directly from 8 popular online bookmakers (country of origin in parentheses): Bet365 (U.K.), Bet&Win/Bwin (Austria), Interwetten (Malta), Ladbrokes (U.K.), Sportingbet (U.K.), William Hill (U.K.), Stan James (U.K.), and VC bet/Betvictor (Gibraltar). Information on betting volume is not made public by online bookmakers.

For data on betting exchange odds I use a dataset from the first and largest online betting exchange, Betfair. The dataset, containing historical results, odds and betting volume, is available on betfair.com. It contains almost all football matches played in any professional division from January 2006 until December 2014. Although Betfair also has data on 'in-play' bets, I only use bets that were placed before a match started, in order to analyze the predictive power of the pre-game odds.

To answer the research question, information on betting volume is needed, which is available only in the Betfair dataset. By combining both datasets at the match level I am able to research betting volume effects for both Betfair and the traditional bookmakers. Therefore, in this paper I use only the matches that are represented in both datasets, leaving 102,888 bets, where a bet is a home, draw, or away bet. This assumes that the relative volumes of betting are similar on the betting exchange and at the bookmakers, meaning that popular matches are popular on both platforms. In line with existing literature (Kopriva, 2015), events with fewer than 20 bets are excluded from the analysis due to the lack of liquidity. Betfair odds turn out to be very similar to bookmaker odds, which can be seen from the high Pearson correlations between the implied probabilities (0.983-0.989, depending on which bookmaker Betfair is compared to).

The bets are divided into 5 bins based on the betting volume to create bins of approximately the same size. The bins are numbered 1 through 5, where bin 1 contains the bets with the lowest betting volume and bin 5 contains the bets with the highest betting volume (see table 2). The relevant variables in the dataset are listed below in table 1.


Table 1: Variable description

Variable Description

Betfair Information based on betting exchange 'Betfair'

B365 Information based on bookmaker 'Bet365'

BW Information based on bookmaker 'Bet&Win/Bwin'

IW Information based on bookmaker 'Interwetten'

LB Information based on bookmaker 'Ladbrokes'

SB Information based on bookmaker 'Sportingbet'

WH Information based on bookmaker 'William Hill'

SJ Information based on bookmaker 'Stan James'

VC Information based on bookmaker 'VC bet/Betvictor'

BM Implied probability bookmaker

BF/BM Ratio implied probability Betfair to implied probability bookmaker

Result Result of the bet, win (1) or loss (0)

Bin 'x' Betting volume bin 'x', low number 'x' = low betting volume, high number 'x' = high betting volume

The dependent variable in my analysis is result, which has the value 0 if the bet did not win and the value 1 if the bet did win. The Betfair dataset contains separate entries for all odds matched for a certain event, which means one event can have up to 100 different odds with matched bets. To make the data manageable I use the weighted average (by volume matched) of all the odds on a single event, in accordance with the existing literature (Kopriva, 2015; Franck et al., 2010; Anners & Saarm, 2015). In this way I obtain a single odds value per event. The betting volume represents the total amount in USD bet on that particular outcome, i.e. the sum of the volume matched over all bets on that outcome. The total amount bet varies immensely in the dataset, from a minimum of $48, consisting of bets on a draw between Dumbarton and Falkirk in 2013, to a maximum of $14,380,088, consisting of 22,272 bets on a home win in the 2009 'El Clásico' FC Barcelona - Real Madrid.
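As a rough sketch of this aggregation step, the snippet below collapses a table of matched bets into one volume-weighted odds value per event, drops illiquid events, and forms five equally sized volume bins. It uses synthetic data and hypothetical column names (event_id, odds, matched_volume); the actual Betfair export is structured differently.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the matched-bets data: one row per matched odds level.
rng = np.random.default_rng(0)
n_rows = 5000
raw = pd.DataFrame({
    "event_id": rng.integers(0, 200, size=n_rows),
    "odds": rng.uniform(1.2, 8.0, size=n_rows),
    "matched_volume": rng.lognormal(mean=5.0, sigma=1.5, size=n_rows),
})

# Volume-weighted average odds per event, plus total volume and number of bets.
raw["odds_x_vol"] = raw["odds"] * raw["matched_volume"]
events = raw.groupby("event_id").agg(
    odds_x_vol=("odds_x_vol", "sum"),
    volume=("matched_volume", "sum"),
    n_bets=("odds", "size"),
).reset_index()
events["odds"] = events["odds_x_vol"] / events["volume"]

# Exclude illiquid events (fewer than 20 matched bets) and split the rest
# into five equally sized betting-volume bins.
events = events[events["n_bets"] >= 20]
events["bin"] = pd.qcut(events["volume"], q=5, labels=[1, 2, 3, 4, 5])
print(events.groupby("bin", observed=True)["volume"].agg(["min", "max", "mean", "count"]))
```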

Table 2: Summary of the betting volume bins

Data:           All          Bin 1    Bin 2     Bin 3     Bin 4     Bin 5
Min (volume)    $48          $48      $5,887    $13,640   $30,840   $90,040
Max (volume)    $14,380,000  $5,887   $13,640   $30,840   $90,040   $14,380,000
Mean (volume)   $114,900     $3,342   $9,164    $20,910   $52,930   $488,000
Mean (bets)     829          187      346       531       866       2,214
Observations    102,888      20,578   20,578    20,577    20,578    20,577

Both the Betfair odds and the bookmaker odds are displayed as decimal odds, $o_i$, which represent the payout ratio if the bet is won. Taking the inverse of the odds gives an implied probability (see the introduction for an example). Because of the 'overround' or 'take' of the bookmakers and Betfair, the implied probabilities of a home win, draw and away win sum to more than one. While some papers still use this method to determine the implied probability (Kopriva, 2015; Abinzano, Muga & Santamaria, 2016), others normalize the odds so that the implied probabilities of the three possible events sum to 1 (Franck, Verbeek & Nüesch, 2010; 2013). Strumbelj (2014) concludes that inverse odds cannot be directly interpreted as probabilities, which is why I use normalized odds, following the formula from Franck et al. (2010):

$$\text{Implied probability}_e = \frac{1}{o_e \cdot \sum_{e} \frac{1}{o_e}}$$

where $e \in \{h, d, a\}$, representing home win, draw and away win.
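The normalization can be written as a small helper; this is a minimal sketch of the Franck et al. (2010) formula above, not code from the thesis.

```python
import numpy as np

def normalized_implied_probabilities(odds_home: float, odds_draw: float, odds_away: float) -> np.ndarray:
    """Rescale the raw inverse odds so the three implied probabilities sum to one."""
    inv = np.array([1.0 / odds_home, 1.0 / odds_draw, 1.0 / odds_away])
    return inv / inv.sum()

# Example with a bookmaker margin: the raw inverse odds sum to about 1.027,
# while the normalized probabilities sum to exactly 1.
probs = normalized_implied_probabilities(2.10, 3.40, 3.90)
print(probs, probs.sum())
```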

To get a first idea about the efficiency of the betting markets, I compare the average probabilities implied by the odds with the actual proportions of home wins, draws, and away wins. As can be seen in table 3, the average Betfair implied probabilities are closest to the actual proportions in all three cases, which is in agreement with the literature discussed in the literature review. Interestingly, all markets seem to underestimate the occurrence of a home win and overestimate the occurrence of an away win and a draw.

Table 3: Actual proportions of home, draw, and away outcomes and the average implied probabilities of Betfair and the bookmakers (standard deviations in parentheses)

Data:      Actual   Betfair   B365     BW       IW       LB       SB       WH       SJ       VC
Home bet   0.4529   0.4510    0.4453   0.4426   0.4380   0.4421   0.4448   0.4395   0.4446   0.4457
                    (0.1635)  (0.1516) (0.1484) (0.1390) (0.1441) (0.1508) (0.1454) (0.1493) (0.1522)
Draw bet   0.2641   0.2651    0.2674   0.2680   0.2697   0.2678   0.2670   0.2714   0.2683   0.2673
                    (0.0466)  (0.0397) (0.0393) (0.0349) (0.0358) (0.0397) (0.0378) (0.0386) (0.0403)
Away bet   0.2831   0.2840    0.2873   0.2894   0.2923   0.2901   0.2883   0.2892   0.2871   0.2870
                    (0.1443)  (0.1353) (0.1326) (0.1235) (0.1292) (0.1342) (0.1296) (0.1331) (0.1355)


5. Empirical Research

5.1 Models

In this section I explain the empirical model that I use to test the predictive power of the odds, to see if efficiency varies with betting volume. The dependent variable of the model is the result and the independent variable is the implied probability of the odds. Because the dependent variable is binary, the statistical model used is a probit regression1. Furthermore, to test whether Betfair odds contain additional information over bookmaker odds, I use a probit regression with the result as dependent variable and the implied probabilities of both the bookmaker and Betfair as independent variables. Given the scope of this thesis, I focus on the betting volume without differentiating between the types of bets (home, away, and draw) as is done in Franck et al. (2010).

Model 1: Probit regression on result (Efficiency)

I denote $y$ as the result, i.e.

$$y = \begin{cases} 1 & \text{if the bet is correct} \\ 0 & \text{if the bet is not correct} \end{cases}$$

I denote $Prob_i$ as the implied probability of the odds, where $i = 1, \dots, 9$ indexes the odds from Betfair and the eight bookmakers. Further, I denote the conditional probability of observing $y = 1$ given the independent variable by $\Pr(y_i = 1 \mid Prob_i)$. The model then looks like this:

$$\Pr(y_i = 1 \mid Prob_i) = \beta_0 + \beta_1 Prob_i + \varepsilon$$

Since the dependent variable is binary and the independent variable is not, I use a probit transformation, $\text{probit}(Y) = \Phi^{-1}(Y)$, where $\Phi$ is the cumulative normal distribution function. The regression model then looks as follows:

$$\text{probit}\big(\Pr(y_i = 1 \mid Prob_i)\big) = \beta_0 + \beta_1 Prob_i + \varepsilon$$

The transformation of the regression model changes the interpretation of the model as well. Before the transformation, the right-hand side predicts the change in $y$, but for a binary variable this is difficult to interpret. In the probit regression model the right-hand side predicts the change in the z-score of $y$. To create interpretable results, the marginal effects are reported for the coefficients. The marginal effect coefficients can be interpreted as the change in the probability of $y = 1$ as a consequence of a change in the implied probability. Because the probit function is non-linear, these coefficients are most precise for small changes in the independent variable. I run nine regressions, each with the dependent variable $y$ and one independent variable.

1 Alternatively a logit or linear probability model could be used. Franck et al. (2010) however found no
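The sketch below fits model 1 on synthetic data with statsmodels: a probit of the result on the implied probability, with heteroscedasticity-consistent standard errors and average marginal effects. It illustrates the setup only; variable names and data are placeholders, not the thesis dataset.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic stand-in for one betting-volume bin.
rng = np.random.default_rng(1)
n = 20_000
implied_prob = rng.uniform(0.05, 0.85, size=n)              # Prob_i
result = (rng.uniform(size=n) < implied_prob).astype(int)   # y: 1 = bet won

# Probit of result on the implied probability, robust (HC1) standard errors.
X = sm.add_constant(implied_prob)
model1 = sm.Probit(result, X).fit(cov_type="HC1", disp=False)

print(model1.summary())
print(model1.get_margeff(at="overall").summary())  # average marginal effects
print("McFadden's pseudo-R2:", model1.prsquared)
```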


Model 2: Probit regression on result (Additional information)

I denote $y_i$ as the result of the bet, i.e.

$$y_i = \begin{cases} 1 & \text{if the bet is correct} \\ 0 & \text{if the bet is not correct} \end{cases}$$

I denote $BF$ as the implied probability of the odds from Betfair, and $BM_i$ as the implied probability of the odds of bookmaker $i$, where $i = 2, \dots, 9$ indexes the eight bookmakers. I denote $Ratio_i = \frac{BF}{BM_i}$ as the ratio2 of the implied probability of the Betfair odds over that of bookmaker $i$. A probit transformation is applied. The regression model is as follows:

$$\text{probit}\big(\Pr(y_i = 1 \mid BM_i, Ratio_i)\big) = \beta_0 + \beta_1 BM_i + \beta_2 Ratio_i + \varepsilon$$

To create interpretable results, the marginal effects are reported for the coefficients. I run eight regressions, each with the dependent variable $y$ and two independent variables: the implied probability of bookmaker $i$ and the corresponding $Ratio_i$.
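Analogously, model 2 adds the Betfair/bookmaker ratio as a second regressor. The sketch below again uses synthetic placeholder data; a significantly positive marginal effect of the ratio would indicate that the Betfair odds carry information beyond the bookmaker's odds.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 20_000
bm = rng.uniform(0.05, 0.85, size=n)                          # bookmaker implied probability
bf = np.clip(bm + rng.normal(0.0, 0.02, size=n), 0.01, 0.99)  # Betfair implied probability
result = (rng.uniform(size=n) < bf).astype(int)               # outcomes driven by BF

ratio = bf / bm                                               # Ratio_i = BF / BM_i
X = sm.add_constant(np.column_stack([bm, ratio]))
model2 = sm.Probit(result, X).fit(cov_type="HC1", disp=False)
print(model2.get_margeff(at="overall").summary())
```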

5.2 Assessment of the models

There is no clear-cut method to assess the efficiency or predictive power of odds; various methods have been used in the literature. The easiest method, commonly used in the early literature (e.g. Ali, 1979; Thaler & Ziemba, 1988), places a fictitious bet on all outcomes and tries to find a type of bet with a positive return. This gives an idea about the efficiency of the market, because in an efficient market every bet should have the same expected return. As mentioned before, a frequent result in this type of research is that betting markets are not completely efficient and often suffer from the FLB. If one wants to say something about the predictive power of odds, one needs to create statistical models, whose quality can be assessed with various measures. To answer my research question a range of tests is performed, which can be divided into three categories: pseudo-R2's, classification-based approaches, and the relatively new heat map approach.

Pseudo-R2

The coefficient of determination, or R2, is a statistic used in linear regression models and serves as a goodness-of-fit estimate. It is a value between 0 and 1, measuring how well the model approximates the data points, where 1 indicates a perfect fit. It can also be interpreted as the improvement from a model without independent variables to the model used, or as the correlation between the predicted values and the actual values. When the dependent variable is binary, however, the R2-statistic is not applicable, because the estimates of a probit model are maximum likelihood estimates. Because probit estimates are not calculated to minimize variance, R2 cannot be used as a measure of goodness of fit. Alternative measures of goodness-of-fit have been developed, one of which is the 'pseudo-R2'. Such measures do not have the same interpretation as the regular R2-statistic; they are merely called this way because they look similar (UCLA, 2016). Unlike R2, there are multiple pseudo-R2's, with no consensus on which one is best, which is why I will use two pseudo-R2's. According to Sung, McDonald and Johnson (2016), pseudo-R2's are "the most fundamentally important tool for assessing the performance of discrete choice models". Furthermore, Benter (1994) stresses the power of pseudo-R2's in comparing the efficiency of predictive models.

2 Another possibility could be including BF and BM in the same probit model. A possible problem however

Conclusions on which model is better are often based on which model has the higher pseudo-R2 (e.g. Franck et al., 2010), without checking whether the difference in pseudo-R2 is significant (Press & Zellner, 1978). Without significance levels, conclusions based on pseudo-R2's could be skewed, because the higher pseudo-R2 could, for example, have a higher standard error. To check for significance, the variance of the statistic is needed, for which the underlying distribution of the statistic has to be known. These requirements are probably why not many papers state the variance of pseudo-R2's, for the underlying distribution depends on many factors and is often complex. Sung et al. (2016) describe a method3 to estimate the underlying distribution of two pseudo-R2's in particular (McFadden's and Maddala's pseudo-R2), which allows for the calculation of the variance of these statistics. The variance can in turn be used to test whether two pseudo-R2's are significantly different.

McFadden's pseudo-R2 is recommended in many econometric textbooks (Dhrymes, 1986, p. 1585; Judge, Griffiths, Hill, Lütkepohl, & Lee, 1985, p. 767) and is also often used in the literature on sports betting efficiency (Schnytzer & Shilony, 1995; Sung, Johnson & Dror, 2009; Franck et al., 2010; McCannon, 2013). The formula proposed by McFadden (1973) is as follows:

$$R^2_{McFadden} = 1 - \frac{\ln L(M_{full})}{\ln L(M_{intercept})}$$

where $L(M_{full})$ is the estimated likelihood function of the model and $L(M_{intercept})$ is the estimated likelihood function of an empty model with only an intercept. This pseudo-R2 is always between 0 and 1, where 0 means no explanatory power and 1 means the model is a perfect predictor.

Maddala's pseudo-R2, also used under the names Cox-Snell pseudo-R2 (1989) and Cragg-Uhler pseudo-R2 (1970), is discussed and recommended in a goodness-of-fit comparison paper (Veall & Zimmermann, 1996) and used in other economic research (e.g. Okunade & Berl, 1997; Lowe & Parvar, 2004). The formula proposed by Maddala (1986) is as follows:

$$R^2_{Maddala} = 1 - \left(\frac{L(M_{intercept})}{L(M_{full})}\right)^{2/n}$$

where $n$ is the number of observations. This pseudo-R2 has an upper bound that depends on $n$, because if the model fully predicts the outcome, i.e. $L(M_{full}) = 1$, the pseudo-R2 equals $1 - L(M_{intercept})^{2/n}$, which is smaller than 1.

3 The derivation and mathematical proof are beyond the scope of this paper. For more details about the
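Both pseudo-R2's can be computed directly from the log-likelihoods of the fitted model and the intercept-only model. A brief sketch on synthetic data follows (statsmodels also reports McFadden's value as `prsquared`).

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 5_000
prob = rng.uniform(0.05, 0.85, size=n)
y = (rng.uniform(size=n) < prob).astype(int)

fit = sm.Probit(y, sm.add_constant(prob)).fit(disp=False)
llf, llnull = fit.llf, fit.llnull  # log-likelihoods: full model / intercept only

mcfadden = 1.0 - llf / llnull
maddala = 1.0 - np.exp((llnull - llf) * 2.0 / n)  # 1 - (L_intercept / L_full)^(2/n)
print("McFadden:", mcfadden, "(statsmodels:", fit.prsquared, ")")
print("Maddala: ", maddala)
```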

For conducting a significance test on a statistic, the underlying distribution of that statistic has to be known. Sometimes it is possible to estimate the distribution based on information about the way the sample is obtained. However, in the case of the pseudo-R2 of a model there is no such distribution, because only one value of the pseudo-R2 is obtained per regression. There are several methods to estimate the distribution of a statistic, even if only one value is given. One of these methods is estimating the distribution using a bootstrap approach, proposed by Efron (1979). Bootstrapping uses random sampling with replacement to estimate the distribution of a statistic. In the case of the pseudo-R2, a randomly chosen part of the sample is replaced by values from an independent and identically distributed population. The bootstrap is repeated many times to create a range of pseudo-R2's; in this paper, the bootstrap is repeated 1000 times, in line with Ohtani (2000). Because the replaced values are drawn from an independent and identically distributed population, the distribution of the pseudo-R2's is assumed to be normal. Now that the distribution of the pseudo-R2 is known, a statistical significance test for the difference between two pseudo-R2's can be performed. The standard normal test statistic tests the alternative hypothesis that the two pseudo-R2's are different, i.e. that the probabilities from one model are more accurate than the probabilities from the other model, against the null hypothesis that the two pseudo-R2's are not different. The formula of the z-statistic is as follows:

$$z = \frac{\mu(R_1^2) - \mu(R_2^2)}{\sqrt{s^2(R_1^2) + s^2(R_2^2)}}$$

where $\mu(R_1^2)$ and $\mu(R_2^2)$ are the sample means and $s^2(R_1^2)$ and $s^2(R_2^2)$ are the sample variances of the pseudo-R2's obtained via the bootstrap method.
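The sketch below shows one way to carry out such a test: a plain resample-with-replacement bootstrap of McFadden's pseudo-R2 in two bins, followed by the z-statistic above. This is a simplified stand-in for the procedure described in the text (and in Sung et al., 2016), not the exact replacement scheme used in the thesis.

```python
import numpy as np
import statsmodels.api as sm

def mcfadden_r2(prob: np.ndarray, y: np.ndarray) -> float:
    return sm.Probit(y, sm.add_constant(prob)).fit(disp=False).prsquared

def bootstrap_r2(prob: np.ndarray, y: np.ndarray, n_boot: int, seed: int) -> np.ndarray:
    rng = np.random.default_rng(seed)
    r2s = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, len(y), size=len(y))  # resample with replacement
        r2s[b] = mcfadden_r2(prob[idx], y[idx])
    return r2s

def z_difference(r2_a: np.ndarray, r2_b: np.ndarray) -> float:
    """z = (mean_a - mean_b) / sqrt(var_a + var_b), as in the formula above."""
    return (r2_a.mean() - r2_b.mean()) / np.sqrt(r2_a.var(ddof=1) + r2_b.var(ddof=1))

# Two synthetic bins: bin B's odds track the outcomes more closely than bin A's.
rng = np.random.default_rng(4)
n = 3_000
true_p = rng.uniform(0.1, 0.8, size=2 * n)
prob_a = np.clip(true_p[:n] + rng.normal(0, 0.10, n), 0.01, 0.99)
prob_b = np.clip(true_p[n:] + rng.normal(0, 0.03, n), 0.01, 0.99)
y = (rng.uniform(size=2 * n) < true_p).astype(int)

r2_a = bootstrap_r2(prob_a, y[:n], n_boot=200, seed=10)  # 200 draws keeps the demo fast
r2_b = bootstrap_r2(prob_b, y[n:], n_boot=200, seed=11)
print(r2_a.mean(), r2_b.mean(), z_difference(r2_b, r2_a))
```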

Classification-based approach (ROC)

Classification-based goodness-of-fit tests check how well the model separates data points into either the bin where a certain event occurs (y=1) or the bin where it does not (y=0). The measure of predictive power is the percentage of correct predictions. A threshold between 0 and 1 is chosen, say 0.5: all probabilities above 0.5 are classified as a win (y=1), while all probabilities below 0.5 are classified as a loss (y=0). By comparing the number of correct predictions, the prediction accuracy can be measured. This statistic depends on the cutoff value and is usually not very meaningful on its own. By plotting the true positive rate, i.e. the number of correctly identified data points divided by all data points, for every probability threshold, a more usable statistic can be inferred. This graphical plot is called a receiver operating characteristic (ROC) curve, an example of which can be seen in figure 1 below.

Figure 1: Example ROC-curve

If the model could discriminate perfectly, the ROC-curve would pass through the point (0,1) on the graph, the red line in figure 1, while a model with no predictive power is represented by a straight 45° line from bottom left to upper right, the black line in figure 1. The better the model, the more the curve bends towards the upper left, making the area under the ROC-curve larger. By calculating the area under the ROC-curve, two models can be compared. A larger area under the curve (AUC) corresponds to a higher true positive rate. More specifically, an area under the curve of 1 means that all predictions are correctly sorted into the right bin, while an area of 0.5 implies randomly classified events (Hanley & McNeil, 1982). To ascertain that a difference in AUC is statistically significant, two AUC's can be compared using a method described by DeLong, DeLong & Clarke-Pearson (1988). The ROC-statistic is widely used in the sports betting literature (e.g. Mori & Hisakado, 2010; Franck et al., 2010; Nagar & Malone, 2011); note, however, that these papers lack a significance check. Esarey and Pierce (2012) note that a classification-based approach merely tries to separate cases where y=0 from cases where y=1 instead of assessing the quality of the probabilities as predictions, and therefore it should be used in combination with other statistics.
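A minimal AUC comparison can be done with scikit-learn, as sketched below on synthetic data; the DeLong et al. (1988) significance test is not implemented here.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(5)
n = 10_000
true_p = rng.uniform(0.1, 0.8, size=n)
y = (rng.uniform(size=n) < true_p).astype(int)

noisy_prob = np.clip(true_p + rng.normal(0, 0.10, n), 0.01, 0.99)  # low-volume bin
sharp_prob = np.clip(true_p + rng.normal(0, 0.03, n), 0.01, 0.99)  # high-volume bin

print("AUC, noisy odds:", roc_auc_score(y, noisy_prob))
print("AUC, sharp odds:", roc_auc_score(y, sharp_prob))  # closer to the true probabilities
```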

Heat Map

In their paper, Esarey and Pierce (2012) compare goodness-of-fit tests and conclude that all methods have their strengths and weaknesses. They propose a new type of test to measure the fit of a binary dependent variable model, which they call 'the heat map'4. "Our basic plan is to compare a model's predicted probability, or $\hat{p}$, to an in-sample empirical estimate of $\Pr(y = 1 \mid \hat{p})$ or $R(\hat{p})$. If $\Pr(y = 1 \mid \hat{p} = m) \approx m$ for every $\hat{p}$ predicted by the model, then the model is a good fit." (Esarey & Pierce, 2012). So the idea is to check, for a probability estimate (the implied probability in my case), say $\hat{p} = 0.3$, whether on average 30% of the bets with this probability end up as a win (y=1). By doing this for all possible values of $\hat{p}$, after nonparametric smoothing to overcome the problem of sparseness of data for some values, a heat map plot and a corresponding heat map statistic can be calculated, which gives a way to assess the fit of the model. The heat map statistic measures what percentage of the values deviates significantly from the perfect line, where a lower percentage means fewer deviations and thus a better model.
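In the same spirit, a crude binned calibration check can be sketched as below: within groups of similar predicted probability, the average prediction is compared to the observed win rate. The actual heat map statistic of Esarey and Pierce (2012) uses nonparametric smoothing and pointwise significance tests rather than fixed bins.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(6)
n = 20_000
p_hat = rng.uniform(0.05, 0.90, size=n)            # predicted (implied) probabilities
y = (rng.uniform(size=n) < p_hat).astype(int)      # outcomes of a well-calibrated market

df = pd.DataFrame({"p_hat": p_hat, "y": y})
df["group"] = pd.cut(df["p_hat"], bins=np.linspace(0, 1, 21))  # 5%-wide probability bins

calibration = df.groupby("group", observed=True).agg(
    mean_p_hat=("p_hat", "mean"),
    win_rate=("y", "mean"),
    n=("y", "size"),
)
calibration["gap"] = calibration["win_rate"] - calibration["mean_p_hat"]
print(calibration)  # gaps near zero indicate well-calibrated (efficient) odds
```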

Brier Score

A last, often used measure of the accuracy of probabilities used to make predictions (e.g. Boulier & Stekler, 2003; Forrest, Goddard & Simmons, 2005; Franck et al., 2010) is the Brier score. The Brier score can be seen as the mean squared error (MSE) of the forecast. The formula proposed by Brier (1950) is as follows:

$$\text{Brier Score} = \frac{1}{N} \sum_{i=1}^{N} (Prob_i - y_i)^2$$

where $Prob_i$ is the forecasted probability (the implied probability of the odds) and $y_i$ is the result. A Brier score ranges from 0 to 1, where a lower score, i.e. a lower MSE of the forecast, means a better model. The Brier score, however, is not suitable for comparing a model across different datasets, only for comparing different models on the same dataset. That is why I only use the Brier score, together with regression model 2, to see whether the Betfair market is more efficient than the bookmaker markets.
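The Brier score itself is a one-liner; a short sketch on synthetic data, computed both directly and with scikit-learn:

```python
import numpy as np
from sklearn.metrics import brier_score_loss

rng = np.random.default_rng(7)
n = 10_000
prob = rng.uniform(0.05, 0.90, size=n)          # forecast probabilities
y = (rng.uniform(size=n) < prob).astype(int)    # realised results

brier_manual = np.mean((prob - y) ** 2)
print(brier_manual, brier_score_loss(y, prob))  # identical; lower means a better forecast
```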

4 The derivation and mathematical proof are beyond the scope of this paper. For more details about the


6. Results

In table 4, the regression results of model 1 are displayed for each betting volume bin separately. The results based on the Betfair odds, along with the results based on the odds of one randomly selected bookmaker (IW), are included in the table. The results of the other seven bookmakers can be found in appendix 1. In all regressions I reject the hypothesis of homoscedasticity, so robust standard errors are used.

Table 4: Regression results of model 1 for Betfair and a randomly selected bookmaker

Betfair              Bin 1      Bin 2      Bin 3      Bin 4      Bin 5
Prob                 1.298***   1.215***   1.079***   1.103***   1.158***
                     (0.035)    (0.031)    (0.027)    (0.024)    (0.021)
McFadden's R2        0.06397    0.06768    0.06390    0.08430    0.10616
Maddala's R2         0.06991    0.07667    0.07487    0.10364    0.13678
AUC (ROC)            0.67395    0.67732    0.66824    0.69236    0.71265
Heat map statistic   80.076%    82.992%    66.589%    79.498%    16.635%
Observations         20,578     20,578     20,577     20,578     20,577

IW                   Bin 1      Bin 2      Bin 3      Bin 4      Bin 5
Prob                 1.115***   1.131***   1.044***   1.133***   1.214***
                     (0.038)    (0.035)    (0.031)    (0.027)    (0.024)
McFadden's R2        0.03524    0.04370    0.04483    0.06821    0.09245
Maddala's R2         0.03909    0.05028    0.05319    0.08431    0.11977
AUC (ROC)            0.62444    0.63751    0.63751    0.67054    0.69832
Heat map statistic   80.212%    71.625%    64.465%    55.773%    29.198%
Observations         20,578     20,578     20,577     20,578     20,577

Note 1: Coefficients of the probit are displayed as marginal effects.
Note 2: Heteroscedasticity-consistent standard errors are displayed between brackets.

6.1 Regression results model 1

The marginal effects of the implied probability are positive, as expected, because an event with a high implied probability is more likely to result in a win (y=1). In a completely efficient market, however, the size of the marginal effect should be unity. The fact that the marginal effect of the implied probability is higher than one in all models suggests the presence of the FLB, because the probability of winning increases disproportionately with the implied probability of the odds. This is in line with the literature discussed in the literature review. A pattern of declining standard errors can be found for Betfair and all eight bookmakers. The coefficients give merely a first impression, while the goodness-of-fit measures give a more complete picture. It is complicated to infer anything about differences between bookmakers based on these results. To test whether one market (or bookmaker) is more efficient than another, regression model 2 is used along with the Brier score, both of which are discussed later in this chapter. The next part discusses the results of the goodness-of-fit models, which are used to test the hypothesis.

Pseudo-R2's

The results of the bootstrap method on McFadden's and Maddala's pseudo-R2 are presented in table 5. The pseudo-R2 for each betting volume bin is displayed per bookmaker, along with the p-values of the standard normal test statistic for a difference between two pseudo-R2's. The results for Betfair along with one randomly selected bookmaker are presented; the results for the other seven bookmakers can be found in appendix 2. A larger pseudo-R2 represents a better model, i.e. odds that are better predictors of the actual outcome.

Table 5.1 - 5.4: P-values of the tests for differences in pseudo-R2 between bins, for Betfair and a randomly selected bookmaker (IW)

Table 5.1: Betfair, McFadden's pseudo-R2
Betfair     Bin 1     Bin 2     Bin 3     Bin 4     Bin 5
Bin 1                 0.41056   0.98862   0.00001   0.00000
Bin 2                           0.39776   0.00039   0.00000
Bin 3                                     0.00002   0.00000
Bin 4                                               0.00001
McFadden    0.06397   0.06768   0.06390   0.08430   0.10616

Table 5.2: Betfair, Maddala's pseudo-R2
Betfair     Bin 1     Bin 2     Bin 3     Bin 4     Bin 5
Bin 1                 0.15053   0.29610   0.00000   0.00000
Bin 2                           0.71235   0.00000   0.00000
Bin 3                                     0.00000   0.00000
Bin 4                                               0.00000
Maddala     0.06991   0.07667   0.07487   0.10364   0.13678

Table 5.3: IW, McFadden's pseudo-R2
IW          Bin 1     Bin 2     Bin 3     Bin 4     Bin 5
Bin 1                 0.01527   0.00643   0.00000   0.00000
Bin 2                           0.75604   0.00000   0.00000
Bin 3                                     0.00000   0.00000
Bin 4                                               0.00000
McFadden    0.03524   0.04370   0.04483   0.06821   0.09245

Table 5.4: IW, Maddala's pseudo-R2
IW          Bin 1     Bin 2     Bin 3     Bin 4     Bin 5
Bin 1                 0.00460   0.00035   0.00000   0.00000
Bin 2                           0.49129   0.00000   0.00000
Bin 3                                     0.00000   0.00000
Bin 4                                               0.00000
Maddala     0.03909   0.05028   0.05319   0.08431   0.11977


As an example, for the McFadden pseudo-R2 of the Betfair market, the test statistic of the difference between (bin 1, bin 2) has a p-value of 0.41056, which is larger than 0.05 (the standard p-value threshold in economics), meaning that the null hypothesis that the pseudo-R2's of bin 1 and bin 2 are equal is not rejected. For the bookmaker, the p-value of (bin 1, bin 2) is 0.01527, which is smaller than 0.05, so the null hypothesis is rejected in favor of the alternative hypothesis that the pseudo-R2's are different. This means that the pseudo-R2 of bin 2 is significantly larger than the pseudo-R2 of bin 1, supporting my hypothesis that events with more betting volume have odds that are better predictors of the outcome. P-values in green are smaller than 0.05, while values in red are larger than 0.05.

As can be seen in tables 5.1 and 5.2 concerning Betfair, the pseudo-R2's of bins 1, 2, and 3 are similar, which is confirmed by the p-values of the difference tests; there is no evidence of a statistical difference between these pseudo-R2's. The pseudo-R2's of bin 4 and bin 5 are, however, significantly higher than the pseudo-R2's of all the lower bins. These results can be interpreted as follows: in the Betfair dataset, the efficiency gains from increased betting volume are only visible after a certain level, in this case after bin 3. After that, more betting volume equals a better model according to this goodness-of-fit measure, supporting the hypothesis that matches with higher betting volume have odds that are better predictors of the actual outcome. This evidence can be found in an even stronger form when looking at the pseudo-R2 tests of the eight bookmakers. The pseudo-R2's differ only slightly in terms of p-values, but the conclusions of the difference tests are the same for all eight bookmakers. As can be seen in tables 5.3 and 5.4 (and the appendix) concerning the bookmakers, more betting volume equals a higher pseudo-R2, with the exception of bins 2 and 3, where we cannot reject the null hypothesis that they have the same predictive power. Both bin 2 and bin 3 do have a larger pseudo-R2 than bin 1.

A possible explanation for this could be the way the bins are formed. Because this is done by ordering the entries by volume and dividing the dataset into five bins, it could be the case that two adjourning bins are relatively similar. When looking at the distribution of bin 3, this indeed seems to be the case. The distribution of the volume in bin 3 is slightly skewed to the left, making bin 3 more similar to bin 2 than to bin 4. Alternatively, it could be due to the fact that the

efficiency gains with betting volume are not linear, meaning that a different between $5000 and $10000 could have major efficiency gains but a gain from $10000 to $15000 does not have the same effect on efficiency.

All in all, no significant decline in pseudo-R2 is found as a result of an increase in betting volume, only a stable value or an increase. The overall trend, which can be found in all

bookmakers, is increased efficiency as result of higher betting volume. Therefore I conclude that this part of the results contains evidence in favor of my first hypothesis.


Classification-based approach (ROC)

The results of the receiver operating characteristic goodness-of-fit tests are presented in tables 6.1 and 6.2. The area under the ROC curve (AUC) for each betting volume bin is displayed per bookmaker, along with the p-values of the test statistic (DeLong et al., 1988) for a difference between two AUC's. The results for Betfair and one randomly selected bookmaker are presented here; the results for the other seven bookmakers can be found in appendix 3. A larger AUC represents a better model, i.e. odds that are better predictors of the actual outcome.

Table 6.1 – 6.2: P-values test of difference in AUC of Betfair and randomly selected bookmaker

Table 6.1: Betfair
            Bin 1     Bin 2     Bin 3     Bin 4     Bin 5
Bin 1                 0.56600   0.32800   0.00117   0.00000
Bin 2                           0.11590   0.00719   0.00000
Bin 3                                     0.00002   0.00000
Bin 4                                               0.00010

Table 6.2: IW
            Bin 1     Bin 2     Bin 3     Bin 4     Bin 5
Bin 1                 0.03110   0.03007   0.00000   0.00000
Bin 2                           0.99970   0.00000   0.00000
Bin 3                                     0.00000   0.00000
Bin 4                                               0.00000
AUC         0.62444   0.63751   0.63751   0.67054   0.69832

As an example, for the AUC of the Betfair market, the test statistic for the difference between bin 1 and bin 2 has a p-value of 0.56600, which is larger than 0.05 (the standard significance threshold in economics), meaning that the null hypothesis that the AUC of bin 1 and bin 2 are equal is not rejected. For the bookmaker, the p-value for bin 1 versus bin 2 is 0.03110, which is smaller than 0.05, so the null hypothesis is rejected in favor of the alternative that the AUC's differ. This means that the AUC of bin 2 is significantly larger than that of bin 1, supporting my hypothesis that events with more betting volume have odds that are better predictors of the outcome. P-values below 0.05 indicate a significant difference between the two bins.

As can be seen in table 6.1, for Betfair the AUC of bins 1, 2, and 3 are similar, which is confirmed by the p-values of the difference tests: there is no evidence of a statistical difference between these AUC's. The AUC of bin 4 and bin 5, however, are significantly higher than the AUC of all lower bins. These results can again be interpreted as follows: in the Betfair dataset, the efficiency gains from betting volume only become visible after a certain level, in this case after bin 3. Beyond that point, more betting volume gives a better prediction according to this goodness-of-fit measure, supporting the hypothesis that matches with higher betting volume have odds that are better predictors of the actual outcome. This evidence can be found in an even stronger form

when looking at the AUC tests of the eight bookmakers. The AUC's differ only slightly between bookmakers in terms of p-value, but the conclusions of the difference tests are the same for all bookmakers, with the exception of Ladbrokes (LB). For the Ladbrokes markets, bins 2 and 3 do not have a significantly different AUC and, additionally, at a threshold of 0.05 for the p-value, bins 1 and 2 and bins 1 and 3 are not significantly different, similar to the Betfair markets. As can be seen in table 6.2 (and the appendix), more betting volume equals a higher AUC, with the exception of bins 2 and 3, which are not significantly different in any market. Both bin 2 and bin 3 do have a larger AUC than bin 1 in all markets except Betfair and Ladbrokes.

As with the pseudo-R2, the way the bins are formed could be an explanation for these results. No decline in AUC is found as a result of an increase in betting volume, only a stable value or an increase. The overall trend, which can be found for all bookmakers, is increased efficiency as a result of higher betting volume. Therefore I conclude that this part of the results contains evidence in favor of my hypothesis, in line with the evidence from the pseudo-R2 goodness-of-fit test.
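As a complement to the sketch for the pseudo-R2's, the snippet below shows how the per-bin AUC could be computed in Python. The DeLong et al. (1988) test used in the thesis is not available in scikit-learn, so the comparison of two bins is sketched here with a simple bootstrap instead; the column names and bin labels are the same hypothetical ones as before.

import numpy as np
import pandas as pd
from scipy import stats
from sklearn.metrics import roc_auc_score

def bin_auc(df: pd.DataFrame) -> pd.Series:
    """AUC of the implied probability as a classifier of the result, per volume bin."""
    return df.groupby("bin").apply(lambda g: roc_auc_score(g["win"], g["implied_prob"]))

def auc_diff_pvalue(g_a, g_b, reps=1000, seed=0):
    """Bootstrap two-sided p-value for AUC(bin a) - AUC(bin b); a stand-in for DeLong's test."""
    rng = np.random.default_rng(seed)
    diffs = []
    for _ in range(reps):
        a = g_a.iloc[rng.integers(0, len(g_a), len(g_a))]
        b = g_b.iloc[rng.integers(0, len(g_b), len(g_b))]
        diffs.append(roc_auc_score(a["win"], a["implied_prob"])
                     - roc_auc_score(b["win"], b["implied_prob"]))
    diffs = np.asarray(diffs)
    z = diffs.mean() / diffs.std(ddof=1)
    return 2 * (1 - stats.norm.cdf(abs(z)))

# Example: compare the two highest-volume bins.
# p = auc_diff_pvalue(df[df["bin"] == 5], df[df["bin"] == 4])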

Heat Map

The results of the heat map goodness-of-fit test (Esarey & Pierce, 2012) are presented in table 7. The heat map statistic for each betting volume bin is displayed per market. The heat map statistic is the percentage of values that are significantly different from the optimal model, so smaller values represent a better model, i.e. a model with odds that are better predictors of the actual outcome.

Table 7: Heat Map statistic
            Bin 1     Bin 2     Bin 3     Bin 4     Bin 5
Betfair   80.076%   82.992%   66.589%   79.498%   16.635%
B365      80.659%   79.021%   67.556%   71.110%   20.129%
BW        81.553%   79.672%   60.014%   68.345%   37.362%
IW        80.212%   71.625%   64.465%   55.773%   29.198%
LB        83.565%   80.187%   67.376%   66.192%   16.465%
SB        80.615%   77.559%   68.319%   67.873%   16.003%
WH        77.612%   72.646%   48.758%   59.972%   29.635%
SJ        78.263%   77.422%   52.768%   67.101%   14.740%
VC        80.805%   79.347%   64.480%   71.095%   16.586%

As an example, the heat map statistic for Betfair in bin 1 is 80.1%, meaning that 80.1% of the values are significantly different from the optimal model, while in bin 5 only 16.6% of the values are significantly different from the optimal model, making bin 5 the better model.

According to Esarey and Pierce, a heat map statistic above 20% represents a flawed model, because the values are too far away from the optimal model; this is the case in 40 of the 45 markets. This is understandable, because only one independent variable is used in the regression, leaving the model vulnerable to omitted variable bias. When trying to find a model that is a perfect predictor of the outcome, one would include information about the teams, their current form, injuries, position relative to the opponent, the result selection, and advanced data analysis of previous matches in the regression. I am, however, not trying to find the model that is a perfect predictor of the outcome; I am merely interested in the predictive power of the odds of betting markets and the effect that betting volume has on this predictive power. Therefore, the value of a single market is not of interest to my research, but rather the difference between models with varying betting volume. What can be said about the value of a single market is that markets in bin 5 have remarkably low values, indicating that these simple models, containing only the implied probability of the odds, are deemed 'good' models by the standards of the authors of this method. A disadvantage of the heat map statistic is that no distribution is known, so when values are similar it is not possible to say that one is significantly larger than the other. In combination with the results of the previous tests, it is however possible to draw tentative conclusions from the heat map statistics.

The results between bookmakers are less unambiguous for the heat map statistics than for the previous goodness-of-fit measures. For most markets there seems to be a decrease in value for bins containing bets with more betting volume, because bin 5 is lower than bin 3, which in turn is lower than bin 1, for all markets. Bin 4, however, is higher than bin 3 for six of the nine markets, suggesting that bin 3 contains better-predicting odds than bin 4, in contrast with the previous findings. Furthermore, bins 1 and 2 have relatively similar values. I conclude that, on average, the heat map statistic provides evidence, although not conclusive, in favor of my hypothesis.
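The heat map statistic itself is not straightforward to reproduce, since it relies on the smoothing and bootstrap procedure of Esarey and Pierce. The snippet below is therefore explicitly not that procedure, only a much cruder calibration table in the same spirit: it compares the empirical win rate with the mean predicted probability per decile of the predictions, which should lie close together for a well-calibrated model. The predicted probabilities could, for instance, be the fitted values of the probit from model 1.

import pandas as pd

def calibration_table(y: pd.Series, p_hat: pd.Series, bins: int = 10) -> pd.DataFrame:
    """Empirical win rate versus mean predicted probability per prediction decile."""
    groups = pd.qcut(p_hat, bins, duplicates="drop")
    table = pd.DataFrame({"p_hat": p_hat, "win": y}).groupby(groups).agg(
        mean_prediction=("p_hat", "mean"),
        empirical_rate=("win", "mean"),
        n=("win", "size"),
    )
    table["gap"] = table["empirical_rate"] - table["mean_prediction"]  # ~0 if well calibrated
    return table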

Conclusion goodness-of-fit measures

All three goodness-of-fit measures described above show similar overall results: an increase in the predictive power of the model with higher betting volumes, in accordance with hypothesis 1. Results from both the pseudo-R2 and the AUC show this increase for each bin, with the exception of bins 2 and 3, which are statistically indistinguishable. As mentioned, this could be caused by the way the bins are formed. The heat map results are less clear-cut. There is an increase when looking at bins 1, 3, and 5, but bin 4 is a less efficient model than bin 3 according to the results. Because no significance test is available for the heat map statistic, this is hard to interpret. On balance, I conclude that more betting volume has a positive effect on the predictive power of the model.

6.2 Second hypothesis

My second hypothesis is that the odds of matches with the highest betting volume are less efficient than the odds of matches just below that level, because inexperienced and uninformed bettors are drawn to the most popular matches. No evidence confirming this hypothesis was found in the analysis of the pseudo-R2's, the ROC-based statistic and the heat map statistic, because bin 4 does not outperform bin 5, the most popular of the bins. If matches slightly below the most popular ones are indeed more efficient, it could be the case that in my results all of these matches are located in bin 5. The efficiency gain from larger betting volume, when comparing bins 4 and 5, perhaps overshadows the potential inefficiency of the most popular matches. This can be tested by creating more bins, for example 100, and comparing bin 99 to bin 100. With 100 bins, bin 99 indeed has a higher AUC and pseudo-R2 than bin 100; however, because of the smaller sample size this difference is not found to be significant, so these results do not constitute evidence in favor of my second hypothesis.

The mean betting volume of bin 99 is $1,497,000, while the mean betting volume of bin 100 is $3,096,000, more than double that amount. That the predictive power of both bins is nevertheless equal calls for an explanation, because with fewer bins the bin with the highest betting volume performed significantly better. It could be the case that the efficiency gained from higher betting volume diminishes after a certain level, making the two bins indeed equally efficient.

An alternative explanation is given by my hypothesis. Two pieces of evidence support this alternative. First, the lower (although in my results not significantly lower) AUC and pseudo-R2 of the highest bin compared with the second-highest bin could indicate reduced efficiency in the very highest bin. Second, with a small number of bins (2 – 7), the highest bin has larger AUC and pseudo-R2 values than the second-highest bin, but with a larger number of bins (10 or more, e.g. 25, 50, 100, 200) the values of the second-highest bin seem to overtake those of the highest bin, which could be another indication. Further research with a larger dataset is needed to prove or disprove my second hypothesis.
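A rough sketch of this robustness check, reusing the bootstrap AUC comparison from the ROC section above and the same hypothetical column names:

import pandas as pd

# 100 equal-sized bins by betting volume; labels run from 1 (lowest) to 100 (highest).
df["bin100"] = pd.qcut(df["volume"], 100, labels=False) + 1
runner_up = df[df["bin100"] == 99]
top = df[df["bin100"] == 100]
# auc_diff_pvalue is the bootstrap comparison sketched earlier; H0: equal AUC.
p_value = auc_diff_pvalue(runner_up, top)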

6.3 Results regression model 2

With regression model 2, the probit model from regression 1 is adjusted to test whether Betfair odds contain additional information over bookmaker odds. If the ratio, defined as the implied probability of Betfair divided by the implied probability of a bookmaker, has a significant and positive coefficient, the Betfair data contain additional predictive power over the bookmaker, making Betfair the more efficient market. For clarification: a significantly positive coefficient means that when the implied probability of Betfair is higher than the implied probability of the bookmaker (making the ratio larger than one), the true probability is higher than the probability suggested by the bookmaker alone. This would show that Betfair data contain additional information that improves the model fit. The results of regression model 2 are found in table 8 below. In all regressions I reject the hypothesis of homoskedasticity, so robust standard errors are used.

B365 BW IW LB SB WH SJ VC
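A minimal sketch of regression model 2 for a single bookmaker, with hypothetical column names 'implied_prob_bf' (Betfair) and 'implied_prob_bk' (the bookmaker); the exact variable definitions are those of the methodology chapter, and the robust (HC1) covariance mirrors the robust standard errors mentioned above.

import pandas as pd
import statsmodels.api as sm

def fit_model2(df: pd.DataFrame):
    """Probit of the result on the bookmaker probability plus the Betfair/bookmaker ratio."""
    data = df.assign(ratio=df["implied_prob_bf"] / df["implied_prob_bk"])
    X = sm.add_constant(data[["implied_prob_bk", "ratio"]])
    # Robust standard errors, since homoskedasticity is rejected in the text.
    return sm.Probit(data["win"], X).fit(disp=False, cov_type="HC1")

# A significantly positive coefficient on 'ratio' would indicate that the Betfair
# odds add predictive information beyond the bookmaker's own odds.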
