
Testing Efficiency in the Dutch Football Betting Market


Academic year: 2021


Testing Efficiency in the Dutch Football Betting Market

Ben Hoiting

s1545582

M.Sc. Ba Finance Thesis

Faculty of Economics and Business Rijksuniversiteit Groningen

Abstract

This thesis assesses the degree of weak and semi strong form efficiency in the Dutch football betting market using data from the 2000-01 to 2010-11 seasons. An examination of weak form efficiency demonstrates that bettors can benefit significantly from selecting the best odds available among bookmakers. Furthermore, betting on draws and away long shots appears to be unfavourable. In addition, the well known anomaly that bettors tend to overbet underdogs is confirmed. These findings and the presence of arbitrage opportunities in the data set are adduced as evidence against weak form efficiency. Semi strong form efficiency is analysed by constructing several econometric forecasting models that provide match outcome probabilities. These models include several fundamental variables that are deemed to have predictive power for football match outcomes. A Poisson count model is estimated to predict goal scoring processes, which are used in a second step to provide match outcome predictions. The ordered probit model is employed to forecast match results directly. The performance of both models is evaluated on statistical and economic grounds. The ability of several betting strategies, especially that of a refined Kelly betting strategy, to produce systematic abnormal returns is ultimately interpreted as evidence against semi strong form market efficiency.

Key words: football, betting markets, market efficiency, Kelly criterion, Poisson count regression, ordered probit model.

Content

I. Introduction

II. Literature review

II.A Information efficiency in financial markets

II.B Organization of betting markets

II.C Weak form information efficiency in betting markets: biases and arbitrage

II.D Semi strong form information efficiency in betting markets

III. Data

IV. Methodology

IV.A A comparison of bookmaker odds and margin

IV.B Weak form efficiency

IV.B.1 Arbitrage opportunities

IV.B.2 Simple betting rules

IV.B.3 Implied outcomes compared with match outcomes

IV.C Semi strong form efficiency

IV.C.1 The indirect model

IV.C.2 The direct model

IV.C.3 Explanatory variables

IV.C.3.1 Historical team quality

IV.C.3.2 Recent performance

IV.C.3.3 Elimination from the Dutch cup

IV.C.3.4 Elimination from European competitions

IV.C.3.5 Distance between home grounds

IV.C.3.6 Attendances

IV.C.3.7 Significant incentive indicator

IV.C.3.8 Shirt colour

IV.C.4 Construction of estimation and forecasting periods

IV.C.5 Model performance

IV.C.5.1 Ranked probability scores

IV.C.5.2 Probability ratio

IV.C.5.3 The Kelly criterion

V. Results

V.B Weak form efficiency

V.B.1 Arbitrage opportunities

V.B.2 Simple betting rules

V.B.3 Implied outcomes and match outcomes

V.C Semi strong form efficiency

V.C.1 Model estimations

V.C.2 Statistical performance

V.C.3 Economical performance

V.C.3.1 Probability ratios

V.C.3.2 The Kelly criterion

VI. Conclusions

References

I. INTRODUCTION

In November 2004 the American billionaire Mark Cuban presented an idea for a new hedge fund on his weblog1. As opposed to a regular fund that invests in bonds, stocks or other financial assets, this hedge fund would use its capital solely to place bets on sporting events. In other words, it is a gambling hedge fund. Although the comments Cuban received on his blog were sceptical to say the least, the reasoning put forward for his hedge fund is interesting. Cuban argues that most casual gamblers, who make up the majority of the market, see their bets as an entertainment experience and expect to lose money. If the majority of people in a market expect to lose money, how efficient can this market really be? Furthermore, Cuban states that bookmakers set odds to attract as much 'emotional money' as possible instead of reflecting the true probability of an outcome. A third argument concerns the amount of information available in the market. Although information from public companies is widely accessible, information concerning sporting events might be even more abundant and easier to acquire and interpret. Local papers, television, radio and the internet provide large amounts of easily accessible, recent information regarding injuries, suspensions, the 'form' of teams, etcetera. One can even tape complete matches for extensive analysis. Cuban appeared to be ahead of his time with his idea, as in 2010 the London based investment company Centaur launched the first ever sports betting hedge fund.

Clearly, considerable parallels between financial and betting markets exist. Grant et al. (2008) even make the bold statement that:

"Betting markets...are no longer distinct, even superficially, from other investment markets" (Grant et al. (2008)).

For a sound and successful investment strategy, be it in a financial market or a betting market, an opinion about the reasons that make the market inefficient is required. In the case of betting markets, professional gamblers often mention Cuban's argument about how bookmakers set their odds as an explanation for inefficiencies in the market. For financial markets in general two types of investment strategies with different views on market efficiency exist: those based on technical and those based on fundamental analysis. Investors that use technical analysis believe that stock prices from the past can be used to predict future

1 Available online at blogmaverick.com, blog of November 27, 2004 titled: My New Hedge Fund.


These developments and arguments indicate the relevance of further research regarding efficiency in sports betting markets. Additionally, studying markets with settings similar to financial markets can provide valuable insights into human behaviour and market forces in a financial context. The present paper examines the efficiency of one of the largest sports betting markets in Europe: betting on football games. More specifically, odds provided by online bookmakers on football matches played in the Dutch Eredivisie in the 2000-01 to 2010-11 seasons are used. Betting on league football is selected for its simplicity and data availability. Furthermore, due to its popularity the liquidity and efficiency in these markets are considered to be high. The focus in this thesis is on the Dutch league as it has not been examined in detail in previous research. In addition, efficiency might be lower in this market in particular because most betting agencies are UK-based and therefore presumably pay less attention to smaller and less prestigious leagues such as the Dutch league. Kochman and Goodwin (2004), however, do not find evidence for a neglected firm effect in betting markets. To examine semi strong form efficiency, publicly available data is collected in order to construct forecasting models that attach probabilities to future match outcomes and goal scoring processes. These probabilities are then compared with the corresponding probabilities provided by bookmakers. Ranked probability scores are computed to determine whether the forecasted probabilities are statistically accurate. Betting strategies are designed to test whether the forecasts of the models can be utilized to systematically obtain positive returns. Prior to this approach, analyses of arbitrage opportunities and various weak form efficiency tests are conducted.

This thesis contributes to the betting market efficiency literature in a number of ways. Most importantly, it is demonstrated that the Dutch football betting market is not efficient, as systematic profits can be generated by employing several betting strategies in combination with model forecasted match outcome probabilities. In addition, the size and timeliness of the composed dataset is unprecedented. Furthermore, the estimated match forecasting models contain several variables not used in previous studies, for example the participation of teams in European competitions and the colour of the shirts teams wear. These variables add to the accuracy of the probability forecasts. Moreover, the criteria used to determine the degree of market efficiency are more refined and more suitable than those applied in prior research. Lastly, this thesis allows for a direct comparison between the direct and indirect methods of forecasting match outcomes.

II. LITERATURE REVIEW

II.A. Information efficiency in financial markets

There is an extensive body of literature concerning information efficiency in financial markets. This concept is today referred to as the 'efficient market hypothesis'. Fama (1991) defines it as:

"The simple statement that security prices fully reflect all available information" (Fama (1991)).

Research in this area goes back to the work of Bachelier (1900), who concluded from examining the behaviour of securities on La Bourse that price changes were independently and identically distributed, implying that the next movement in a particular time series could not be predicted from prior movements. Bachelier furthermore concluded that stock price changes follow a normal distribution and that prices follow a random walk. This proposition has become the foundation of much subsequent work. It was not until 1953, however, (due to the emergence of the computer) that relevant research on the behaviour of security prices continued. Kendall (1953) examined the behaviour of weekly changes in nineteen indices of British industrial share prices and spot prices for cotton and wheat. He likewise found empirical evidence that series of speculative prices may be well described by random walks.

Fama (1970) uses the work of Roberts (1959) to show that market efficiency can exist independent of price dependence. Fama (1970) defines three forms of efficiency. Weak form market efficiency refers to the notion that asset prices are path independent, implying that past prices cannot be used to predict future prices. Semi strong form efficiency is subject to the extra requirement that all public information is accounted for in asset prices. Finally, strong form market efficiency assumes that even inside information is reflected in the price of an asset.

negative, resulting in some evidence rejecting weak form efficiency. Filter rule tests are frequently used in testing for weak form market efficiency as well. The researcher determines an investment rule where the filter is the percentage that a stock price has to rise (fall) in order to buy (sell) a particular stock. The return of this strategy is compared with that of a buy-and-hold strategy. This approach is followed by, among others, Fama (1965) and Fama and Blume (1966). They find no abnormal returns, which is interpreted as evidence in favour of weak form efficiency. Cyclical tests look for cyclical behaviour in time series. The well known January, Friday and Monday effects are results of this method. Rozeff and Kinney (1976) were the first to report evidence indicating that January stock returns were significantly higher than stock returns in other months. French (1980) and Gibbons and Hess (1981) conduct cyclical tests as well but find inconclusive results. Finally, volatility tests are used to test weak form market efficiency. The main assumption underlying these tests is that expected returns are constant and shocks in expected dividends are the only sources of variation in stock prices. LeRoy and Porter (1981) and Grossman and Shiller (1981) employ this method and their findings are commonly interpreted as evidence against market efficiency.
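The filter rule described above can be illustrated with a small sketch. It is a simplified, long-only variant written for this illustration (buy after a rise of x percent from the most recent low, sell after a fall of x percent from the most recent high); it is not the exact procedure of Fama and Blume (1966), and the price series, function names and the 5 percent filter are hypothetical.

```python
from typing import Sequence


def filter_rule_return(prices: Sequence[float], filter_pct: float) -> float:
    """Total return of a simplified long-only filter rule (illustrative assumption).

    Buy once the price has risen by filter_pct from its most recent low,
    sell once it has fallen by filter_pct from its most recent high.
    Transaction costs are ignored.
    """
    holding = False
    wealth = 1.0
    entry = 0.0
    low = high = prices[0]
    for p in prices[1:]:
        if not holding:
            low = min(low, p)
            if p >= low * (1.0 + filter_pct):
                holding, entry, high = True, p, p
        else:
            high = max(high, p)
            if p <= high * (1.0 - filter_pct):
                wealth *= p / entry
                holding, low = False, p
    if holding:
        # Close any open position at the last observed price.
        wealth *= prices[-1] / entry
    return wealth - 1.0


def buy_and_hold_return(prices: Sequence[float]) -> float:
    """Return from buying at the first price and selling at the last."""
    return prices[-1] / prices[0] - 1.0


# Hypothetical price path, used only to show the comparison.
example_prices = [100, 95, 100, 110, 120, 112, 105, 100, 106, 115]
rule_return = filter_rule_return(example_prices, 0.05)
benchmark_return = buy_and_hold_return(example_prices)
```

A weak form test in this spirit would compare `rule_return` with `benchmark_return` over many securities and filter sizes, after transaction costs.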


Research concerning strong form efficiency is less voluminous as it is considerably more difficult to observe private information. Jaffe (1974) finds that for insiders the stock market is not efficient and that they possess information that is not yet reflected in stock prices. Penman (1982) finds evidence that corporate insiders time their trades relative to the announcement of their firm's earnings prospects, resulting in abnormal returns. Brown et al. (1991) examine strong form efficiency in the Toronto stock market by focusing on the stock forecasts of several brokerage firms' financial analysts. These forecasts indeed contain significant predictive information about future stock returns. Brown et al. (1990) argue that this can be attributed to the information the analysts obtain at the regular company visits they have to undertake. Meulbroek (1992) examines cases convicted by the Securities and Exchange Commission and finds an abnormal return of three percent on insider trading days. Furthermore, nearly 50 percent of the pre-announcement rise in stock prices for stocks involved in takeovers occurs on insider trading days.

Overall the consensus concerning efficiency in financial markets is that information is accounted for properly and quickly, with few exceptions. According to Schwert (2003) it is especially interesting to see that many well known anomalies fail to hold up in different sample periods, reflecting movement towards efficiency. The weekend and dividend yield effects for example seem to have gradually disappeared after publication. Despite the enormous body of literature concerning the efficient market hypothesis, most research runs into a major obstacle identified by Fama (1970) as the joint hypothesis problem. Testing for efficiency always goes hand in hand with testing for some form of equilibrium model, an asset pricing model. Hence, it is difficult to determine whether observed inefficiencies are truly 'market inefficiencies' or mere flaws in the relevant pricing model. The well known size and value effects for example were initially marked as inefficiencies, whereas Fama and French (1992, 1993) later argued that these effects were risk factors missing from the Sharpe-Lintner-Mossin capital asset pricing model, resulting in the development of the Fama-French three factor model. Betting markets do not suffer from this imperfection.

II.B. Organization of betting markets

As mentioned in the introduction, betting markets are similar to financial markets in numerous ways. Both are characterized by many participants on both the demand and supply side. Furthermore, both markets are in fact zero sum games (if one gains, another loses) and in both markets large amounts of money are at stake. Before examining the main results concerning efficiency in betting markets it is necessary to explain what betting markets are and how they are organized. Chinn (1991) defines a bet as:

"Something to be staked or won on the result of a doubtful issue and all bettors hope that the result of their bet will be resolved in their favour and so give them a remuneration" (Chinn (1991)).

Common examples of such issues are horse races and sport matches, although virtually any event with an uncertain outcome can be the subject of a bet5. For a bet to take place at least two parties are needed. These can be two individual bettors or an individual bettor and a bookmaker. For racetrack betting there are in general two ways in which bets are offered: pari-mutuel and through bookmakers. Pari-mutuel betting implies the pooling of gamblers' money which, after subtraction of transaction costs and commissions, is later divided proportionally among the winners. Most tracks in the USA offer this type of betting. In the UK bookmakers are more commonly employed to form an odds market. In essence, a bookmaker is quite similar to a professional bettor, although there are two important differences. First, bookmakers are price setters and are in the business of taking bets rather than placing them. Secondly, where bettors usually bet with one or a few bookies, a bookmaker typically takes bets from numerous bettors. For sports betting markets the most important betting forms are point spread betting and fixed odds betting. In point spread betting markets the bookmakers try to even the odds between two teams of unequal merit. The stronger team then has to win by a certain number of points for a bet on it to win. For example, in the Super Bowl of the 2010-11 season between the Pittsburgh Steelers and the Green Bay Packers the spread was on average -2.5 for the Packers and hence +2.5 for the Steelers. Thus when a bettor elected to bet on the Packers, the Packers had to beat the Steelers by more than 2.5 points for his bet to win. Point spread betting is commonly used in sports such as American football, basketball, and ice hockey. For fixed odds betting markets bookmakers provide a fixed price for every match outcome. This system is more popular for betting on, for example, football matches and is examined and explained in detail further in this thesis.
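As a rough illustration of the two mechanisms described above, the sketch below computes a pari-mutuel payout per unit staked and settles a point spread bet. The pool sizes, the 15 percent take and the function names are hypothetical and chosen only for this example.

```python
from typing import Mapping


def pari_mutuel_payout(pools: Mapping[str, float], winner: str, take_rate: float) -> float:
    """Payout per unit staked on the winner in a pari-mutuel pool.

    The total pool, net of the operator's take (costs and commission),
    is divided proportionally among the winning stakes.
    """
    net_pool = sum(pools.values()) * (1.0 - take_rate)
    return net_pool / pools[winner]


def spread_bet_wins(margin: float, spread: float) -> bool:
    """True if a bet on a team wins against the spread.

    margin: points scored by the backed team minus points of the opponent.
    spread: the handicap quoted for the backed team (e.g. -2.5 for a favourite).
    """
    return margin + spread > 0


# Hypothetical pool: 500 staked on horse A, 300 on B, 200 on C, 15 percent take.
example_pools = {"A": 500.0, "B": 300.0, "C": 200.0}
payout_b = pari_mutuel_payout(example_pools, "B", 0.15)

# A favourite quoted at -2.5 must win by 3 or more points for the bet to win.
favourite_covers = spread_bet_wins(margin=6, spread=-2.5)
favourite_fails = spread_bet_wins(margin=2, spread=-2.5)
```

Note that in the pari-mutuel case the payout is only known after the pool closes, whereas a spread or fixed odds bet is settled at the terms quoted when the bet was placed.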

5 William Hill for example offered quotes on various events related to the wedding of Prince William and Catherine Middleton on February 24, 2011.

Interesting writings are nowadays available about how bookmakers set their prices. The 'traditional' view of bookmaker behaviour is that bookmakers do not care about the outcome of an event, but are extremely good at predicting which price(s) will result in even proportions wagered on all outcomes. Lee and Smith (2002) support this view and give a sound explanation for a point spread betting market:

"Bookies do not want their profits to depend on the outcome of the game. Their objective is to set the point spread to equalize the number of dollars wagered on each team and to set the total line to equalize the number of dollars wagered over and under. If they achieve this objective, then the losers pay the winners $10 and pay the bookmaker $1, no matter how the game turns out" (Lee and Smith (2002)).
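The payoff structure in this quotation can be checked with a short sketch. It assumes the convention implied by the quote, namely that a bettor risks $11 to win $10; the stake totals and the function name are hypothetical.

```python
def bookmaker_profit(risked_on_a: float, risked_on_b: float, winner: str,
                     risk: float = 11.0, win: float = 10.0) -> float:
    """Bookmaker profit when bettors risk `risk` dollars to win `win` dollars.

    The bookmaker keeps everything risked on the losing side and pays
    win/risk dollars per dollar risked on the winning side.
    """
    if winner == "A":
        winning, losing = risked_on_a, risked_on_b
    else:
        winning, losing = risked_on_b, risked_on_a
    return losing - winning * win / risk


# Balanced book: the profit is the same whichever side wins.
balanced_if_a = bookmaker_profit(1100.0, 1100.0, "A")
balanced_if_b = bookmaker_profit(1100.0, 1100.0, "B")

# Unbalanced book: the profit now depends on the outcome of the game.
unbalanced_if_a = bookmaker_profit(1650.0, 550.0, "A")
unbalanced_if_b = bookmaker_profit(1650.0, 550.0, "B")
```

With equal money on both sides the bookmaker earns 100 on 2,200 risked in either case; with the unbalanced book the result swings from a loss of 950 to a gain of 1,150, which is the exposure the 'traditional' view says bookmakers try to avoid.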

Levitt (2004) however finds evidence for a different view on bookmaker behaviour. He first notes that there is a difference between how prices in financial markets arise and how odds quoted by bookies arise. In financial markets, prices change frequently and significantly to adjust for supply and demand. In betting markets the adjustments of odds provided by bookmakers are smaller and less frequent. If bookmakers have failed to offer the market clearing price, they are therefore vulnerable to considerable risk. Levitt continues to argue that although this price setting mechanism seems risky, there are at least three possible scenarios in which the bookmaker can keep making profits using it. The first scenario is the aforementioned 'traditional' view in which prices are set to balance the number of bets on every outcome. The second scenario is that bookmakers do care about outcomes and that they are systematically better at determining these outcomes than punters. Bookmakers could then set the 'correct' price and charge a commission. A bookmaker's earnings in the long run then amount to the number of bets made times the charged commission. The third possibility is a combination of both price setting mechanisms. If bookmakers are able to predict the distribution of bets made and are better at predicting the outcomes of games, they can systematically offer 'wrong' prices so as to take advantage of the preferences of bettors, increasing their profits. Levitt (2004) is able to examine which of the three scenarios deserves the most support as he possesses a unique dataset that consists not only of prices (odds) of bets on NFL matches, but also of the quantity of bets wagered on every outcome. He finds strong evidence against the first price setting scenario and argues that bookmakers appear to set prices strategically to exploit bettors' biases, allowing them to increase their gross profits by 20 to 30 percent. Franck et al. (2011) develop a theoretical model which shows that the pricing decision of a profit maximizing bookmaker depends on the occurrence of bettor sentiment and the elasticity of demand he is facing. They find evidence of such behaviour by examining 16,000 English football games. Other works that confirm the findings of Levitt (2004) include Paul and Weinbach (2007, 2008, 2010). The insight that capitalizing on the preferences of bettors plays an important part in bookmakers' price setting decisions is crucial in developing a profitable betting strategy, as is attempted in this thesis. Clearly, if an intelligent bettor were able to identify this 'mispricing' behaviour, bookmakers would face tremendous financial risk.

II.C. Weak form information efficiency in betting markets: biases and arbitrage

Tests of information efficiency in betting markets are available in many forms. The earliest papers in this direction appeared in the late 1940s and are mostly of an experimental nature. The first empirical studies make use of data from traditional race track betting. More recent papers often use data from sports betting markets, where football, tennis, ice hockey, and American football are particularly popular.

The definition of weak form information efficiency in financial markets is stated in section II.A. In short it is the notion that current prices incorporate all information available in past prices and price movements. Therefore, if the market is weak form efficient, it should not be possible to obtain abnormal returns by means of a strategy that uses past prices to predict future prices. In research regarding efficiency in betting markets this idea is generally adapted to examine the possibility of earning differential returns in the future by using information contained in past prices, where prices in this case take the form of odds.
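One way to make this adaptation concrete is to compute the average return of a fixed betting rule that uses nothing but the quoted odds. The sketch below assumes decimal odds (total payout per unit staked) and uses a tiny invented sample; the match data, outcome labels and function name are hypothetical and do not come from the thesis dataset.

```python
from typing import Dict, List, Tuple

Match = Tuple[Dict[str, float], str]  # (decimal odds per outcome, realised outcome)


def mean_return(matches: List[Match], pick: str) -> float:
    """Average profit per unit staked when always backing the outcome `pick`.

    A winning bet at decimal odds o yields a profit of o - 1, a losing bet
    loses the full stake.
    """
    profits = [odds[pick] - 1.0 if result == pick else -1.0
               for odds, result in matches]
    return sum(profits) / len(profits)


# Invented sample: H = home win, D = draw, A = away win.
sample = [
    ({"H": 1.8, "D": 3.5, "A": 4.5}, "H"),
    ({"H": 2.4, "D": 3.2, "A": 3.0}, "D"),
    ({"H": 1.5, "D": 4.0, "A": 6.5}, "H"),
    ({"H": 3.1, "D": 3.3, "A": 2.3}, "A"),
]
returns_by_rule = {pick: mean_return(sample, pick) for pick in ("H", "D", "A")}
```

Under weak form efficiency, no such rule should earn returns that differ systematically from those of other rules once the bookmaker margin is taken into account.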

For odds below 3:10 ($1.- to win $1.30) the expected returns are positive. Without transaction costs, returns would already be positive for odds below 9:2 ($1.- to win $4.50). When a horse is quoted 100:1 on the other hand the expected return drops to 13.7 cents per wagered dollar. In other words the real odds for such a wager would be about 730 to 1. Not all papers however confirm the existence of a favourite bias in racetrack markets. Busche and Hall (1988) fail to find evidence for a favourite bias using data from Hong Kong race tracks. Busche (1994) confirms the results of Busche and Hall (1988). Nevertheless these articles are viewed more as the exception that proves the rule than as serious evidence against the favourite bias, not for nothing Busche and Hall (1988) begin their article by stating they report an anomaly of an anomaly.

Research building upon the findings of for example Ziemba and Hausch (1986) turns to other betting markets. Cain et al. (2000) test for market efficiency in the UK football betting market and conclude that the favourite bias is also present in this market, Vlastakis et al. [...] data from the European football betting market find similar results. Woodland and Woodland (1994) and Woodland and Woodland (2001), on the other hand surprisingly report the exact opposite. By means of data regarding the betting markets on NHL and MLB8 [...] a 'reverse favourite long shot bias' implying that instead of overbetting [...] participants overbet the favourite. The question which factors cause the favourite long shot bias hence arises.

8 US National Hockey League and Major League Baseball respectively.


Several anomalies similar to the favourite bias are documented in the literature. Inefficiencies regarding the timing of bets for example are well known. Ottaviani and Sorensen (2005b) mention that large amounts of money are placed just before post time, but sizeable amounts are placed much earlier as well. If more information becomes available later, then why are so many bets placed well before post time? They refer to this phenomenon as the puzzle of early betting. Gandar et al. (2001) note that later bets tend to contain more information about the horses' finishing order than earlier bets.

Regarding sports betting markets the home field advantage has been the topic of various papers. Amoako-Adu et al. (1985) find positive returns when employing the betting rule 'bet on the home team when it is the underdog'. Golec and Tamarkin (1991) come to a similar conclusion; however, neither study gives a theoretical explanation. Schwartz and Barsky (1977) explain why home playing teams should perform better: they benefit from learning factors (e.g. familiarity with the pitch and stadium), do not have to travel and have the support of a larger part of the crowd. Courneya and Carron (1991) add that referees often judge in favour of home teams. Baumeister and Steinhilber (1984) however find that professional baseball and basketball teams play unusually badly (i.e. choke) in decisive home games. Vergin and Sosik (1999) examine sixteen NFL seasons and conclude that bettors seem to misprice the home field advantage in matches with national focus. However, Gandar et al. (2001) revisit the paper by Vergin and Sosik (1999) and find little evidence of this bias in the NBA and the NFL betting markets. They state that they remain unconvinced of the existence of a home team bias in games with a national focus.

Another unique property of sports betting markets, besides the distinction between home and away games, is that bettors might feel a strong sentiment for a particular team or individual. This could result in bettors overestimating the chance that their team will be victorious. Therefore, more bets will be placed on teams with large 'fan bases'. If bookmakers pick up on such behaviour they are likely to lower odds on these teams. Avery and Chevalier (1999) are among the first to mention such a 'sentiment bias'. They use data from the NFL betting market to test their thesis that losses from backing 'prestigious' teams are abnormally high. Avery and Chevalier (1999) indeed find support for their hypothesis. Kuypers (2000) is the first to note the possibility that bookmakers may adjust odds to account for the presence of committed supporters. He builds a theoretical model that takes bookmaker behaviour as a starting point. He shows that it is profitable for bookmakers to offer lower than efficient odds on teams with large fan bases. Surprisingly, Kuypers (2000) does not test his hypothesis that odds will vary with levels of supporters. Forrest and Simmons (2008) take the work of both Avery and Chevalier (1999) and Kuypers (2000) as a base for their model regarding the betting market on Spanish football. They estimate a probit regression using bets available on Spanish premier league matches. Their findings challenge the proposition made by Kuypers (2000). The reason for this contradiction between theoretical prediction and empirical results, as put forward by Forrest and Simmons (2008), is that the developed models contain flaws with respect to the decision process of bettors. Fans of a certain team might have a different view on what product is being offered than bookmakers. Fans are likely to see 'a bet on their team' as the product being sold instead of 'a bet on a match'. The assessment they make is how much to bet on 'their' team instead of which team to bet on. Kuypers (2000) assumes that the response of 'fan bettors' is to switch bets to the opponent at some equilibrium price. This however might be as unlikely as an Ajax fan buying a Feyenoord shirt when Ajax shirts become more and more expensive. Franck et al. (2011) develop and test another bookmaker pricing model. They conclude that bookmakers will increase prices (lower odds) in the presence of price-insensitive sentiment bettors and can increase profits by lowering prices (higher odds) when sentiment bettors are price-sensitive. Braun and Kvasnicka (2011) search for national sentiment using data from the UEFA Euro 2008 championship. They find several biases in the pricing behaviour of bookmakers regarding their own national teams, although the signs of these biases appear to differ across countries. Although the findings on the influence of a 'fan bias' are far from unanimous, it appears to be an important factor in bookmakers' pricing decision processes.


efficiently accounted for in bet prices. In particular bets on home teams playing in the coldest temperatures are profitable.

Whereas a large amount of attention in the literature is given to biases and anomalies (the favourite bias in particular), arbitrage is a subject dealt with often as well. Many studies of efficiency analyse to some extent the presence of arbitrage opportunities in their data. Hausch and Ziemba (1990) develop an arbitrage model for the U.S. cross-track betting market on major horse races. They find considerable arbitrage opportunities that provide risk free returns of up to 15 percent. They attribute this mispricing by the public to limited access to information, and to the fact that certain horses are more familiar to bettors in certain regions than in other regions. For example, a horse well known in England that ended up winning the 1983 Arlington Million was quoted 4-1 to win in England and 38-1 (!) to win at Arlington Park, Illinois.

Edelman and O'Brian (2004) use a game theoretic approach on data from the Australian thoroughbred race track betting market and conclude that arbitrage opportunities appear fairly regularly. Pope and Peel (1989) find arbitrage opportunities in the UK football betting market of up to two percent after taxes. Vlastakis et al. (2009) find few arbitrage opportunities in their examination of the European football betting market. The opportunities that do occur nevertheless result in significant positive returns averaging 21.78 percent. If only online bookmakers are used the opportunities for arbitrage become even fewer in number. Franck et al. (2012) use odds quoted by traditional bookmakers in combination with odds from newly formed betting exchanges on European football matches to find arbitrage opportunities. Their method results in a guaranteed positive return in 26 percent (!) of the matches.
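The mechanics of such an arbitrage can be sketched for a three outcome football market. The example assumes decimal odds (total payout per unit staked) and invented quotes from three hypothetical bookmakers X, Y and Z; it illustrates the general principle and is not the specific procedure of any of the studies cited above.

```python
from typing import Dict, Optional, Tuple

Odds = Dict[str, float]  # decimal odds per outcome


def best_odds(quotes: Dict[str, Odds]) -> Odds:
    """Highest quoted decimal odds per outcome across all bookmakers."""
    outcomes = next(iter(quotes.values())).keys()
    return {o: max(q[o] for q in quotes.values()) for o in outcomes}


def find_arbitrage(odds: Odds) -> Optional[Tuple[float, Dict[str, float]]]:
    """Return (risk free return, stake shares) or None if no arbitrage exists.

    An arbitrage exists when the inverse odds sum to less than one. Staking
    each outcome in proportion to its inverse odds gives the same payout
    whatever the result.
    """
    inverse_sum = sum(1.0 / o for o in odds.values())
    if inverse_sum >= 1.0:
        return None
    stakes = {k: (1.0 / o) / inverse_sum for k, o in odds.items()}
    return 1.0 / inverse_sum - 1.0, stakes


# Invented quotes (home win, draw, away win) from three hypothetical bookmakers.
quotes = {
    "X": {"H": 2.10, "D": 3.30, "A": 3.60},
    "Y": {"H": 1.95, "D": 3.60, "A": 4.20},
    "Z": {"H": 2.25, "D": 3.20, "A": 3.40},
}
combined = best_odds(quotes)
opportunity = find_arbitrage(combined)
single_book = find_arbitrage(quotes["X"])
```

In this invented example no single bookmaker offers an arbitrage, because each builds a margin into its own odds, but combining the best quote per outcome across bookmakers does. Transaction costs, taxes and stake limits are ignored here.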

II.D. Semi strong form information efficiency in betting markets

The studies mentioned in the previous section in general use the information contained in odds from previous events to evaluate the efficiency of betting markets, although some (e.g. Kuypers (2000)) include a few variables reflecting public information. Research that tests for semi strong form efficiency in betting markets is much less common. The most widespread form of testing for semi strong form efficiency is through evaluation of the profitability of betting systems that are based on widely available information. These systems are different from those based solely on information contained in the odds and are more complex to construct in general.

Vergin (1977) examines six betting strategies on horse races run in California. Of these only the 'elimination rule' developed by McQuaid (1971) produces systematically profitable returns. Following this rule, one should not bet on (eliminate) a horse that does not meet certain requirements, for example that it has won a race before. Quirin (1979) shows that the inside six positions win more often than expected, the inner position being the most beneficial overall. Beyer (1983) notes that at oval tracks shorter than a mile the front runners and inside starters have a significant advantage. Canfield et al. (1987) confirm these results using a three year Canadian dataset. However, they conclude that the public also overbets this bias, eliminating all positive returns. Furthermore, they find that starting in the inner lane can instead be disadvantageous on rainy days as the rain accumulates near the rail, turning the sand into mud. The public does not account for this phenomenon.

Bolton and Chapman (1986) develop a multinomial logit model for horse racing processes using a dataset consisting of 200 races. Among the variables in their discrete choice model are horse and jockey characteristics as well as several race specific features. The signs of the estimated coefficients in their model are in accordance with their a priori theoretical reasoning. They test the usefulness of their model by employing several betting strategies. The returns from these strategies were significantly higher than those from a random betting strategy, although not profitable. By adding some additional constraints, including the elimination of bets on long shots, positive returns were achieved, although Bolton and Chapman (1986) admit that their sample size might be too small for strong conclusions. Edelman (2003) uses competitive ratings to estimate a weighted least squares regression, resulting in probability forecasts for horses in Australian sprint races. His model appears to function particularly well and gives rise to some reliable and profitable betting strategies. He too acknowledges that a larger database including races of other distances would result in an even more powerful model.


Poisson model is then conducted to improve the fit of the model. The model of Maher (1982) is extended by Dixon and Coles (1997), who also use Poisson models with parameters related to past performance to forecast goals scored by home and away teams. Their model is estimated using data from the 1992-93 to 1994-95 English Premier League and cup seasons. They make some small refinements to improve the realism and precision of the models. The estimated probabilities are used as input for a rather simple betting strategy that places bets on matches for which the probability estimated by the model exceeds the probability of the bookmaker by a specified level. The betting strategy is applied to the 1995-96 out of sample season and provides consistent positive returns.

Rue and Salvesen (2000) build a modified Poisson model as well. They use a Bayesian generalized linear specification to estimate the parameters. They include separate defensive and attacking abilities of teams and a variable that reflects the psychological effect of stronger teams underestimating the weaker competition. Furthermore, they argue that only the first 5 goals scored by a team reflect information about the capabilities of teams and therefore truncate the number of goals scored by teams to five. A match that ended in 6-3 for example is interpreted as 5-3. The data used by Rue and Salvesen is from the 1997-98 English First Division and Premier League seasons. The estimated probabilities are compared with odds from the online bookmaker Intertops. Positive returns of 39.6 and 54.0 percent are generated for the Premier League and First Division seasons respectively.

The use of discrete choice regression models, which are able to predict match outcomes directly, has increased significantly in the literature recently. Kuk (1995) is often referred to as the first paper that takes match outcomes as the dependent variable. Kuk uses the method of moments to estimate an ordered probit model. He does not provide out of sample estimations and uses only the final ranking of teams at the end of the season as data input.


bookmaker with some pre-specified factor. This results in positive post tax results up to 32 percent. Kuypers (2000) thus finds strong evidence for semi strong form market inefficiency. Goddard and Asimakopoulos (2004) construct an ordered probit model using data from ten English Premier League seasons. In addition to past match results, the participation of teams in cup football, geographical distance and the level of significance of a match for a team contribute to a more accurate forecasting performance. Forecasts are generated for two out of sample seasons. A betting rule then is applied which involves placing a one pound wager on the match for which the model's ex ante expected return is the highest. Positive returns of 8 percent are generated in matches played in April and May, which is interpreted as particularly strong evidence of market inefficiency. Forrest et al. (2005) employ the same framework and methodology as Goddard and Asimakopoulos (2004). They conclude that the predictions of bookmakers are becoming more accurate over time, implying that the market is moving towards efficiency. They attribute this phenomenon to the increasing level of competition among online bookmakers.

III. DATA

The major part of the data is collected from the website www.football-data.co.uk. This website is one of the most extensive and accurate providers of historical odds and results of major European football leagues. The sources for additional data required for the construction of several explanatory variables are mentioned in the relevant methodology sections. The final dataset contains data from eleven seasons of football in the Dutch Eredivisie (season 2000-01 until season 2010-11). This league is selected as it is one of the most attractive leagues in Europe, with a high average number of goals per match (3.00 for the total dataset), making it an interesting league for punters. Additionally, the fact that most major bookmakers (William Hill, Ladbrokes, etc.) are UK based might contribute to significant results, as the forecasts of bookmakers for a smaller foreign league presumably are less accurate. Finally, this league has not been examined in detail before, as the present football betting literature almost exclusively focuses on the English Premier League.

The Dutch Eredivisie is the top football league in the Netherlands. The league consists of eighteen professional football clubs that play two matches against all opponents, once at home and once away. Every win results in three points, a draw gives one point to both teams and a loss results in no points at all. The team with the most points at the end of the competition is crowned the champion. The team that finishes last is relegated to the first division. The teams that finish sixteenth and seventeenth have to participate in play off matches against teams from the first division to avoid relegation. The competition usually starts in August and ends in May. After the deletion of incomplete cases (nine matches), the odds and results of 3,357 matches are included in the dataset.

Table A: Match outcomes in the 2000-01 to 2010-11 seasons

Outcome     Frequency   Percentage
Away win    949         28.3%
Draw        762         22.7%
Home win    1646        49.0%


Table B: Goals scored by home and away teams in the 2000-01 to 2010-11 seasons

          Home team                  Away team
Goals     Frequency   Cumulative     Frequency   Cumulative
0         690         20.6%          1118        33.3%
1         983         49.8%          1073        65.3%
2         781         73.1%          694         85.9%
3         493         87.8%          297         94.8%
4         238         94.9%          121         98.4%
5         102         97.9%          43          99.7%
6         48          99.3%          9           99.9%
7+        22          100.0%         2           100.0%

The online bookmakers in the dataset that provide odds are Bet 365 (A), Blue Square (B), Bet&Win (C), Gamebookers (D), Interwetten (E), Ladbrokes (F), Sportingbet (G), Stan James (H), Victor Chandler (I) and William Hill (J). All bookmakers are European based and are active up to this moment. These firms account for a large proportion of the European betting market. William Hill and Ladbrokes for example are market leaders based in the UK with revenues of over one billion pounds in 2010¹¹. Furthermore, seven out of the ten bookmakers are named in the 2011 Power 50, a well respected list of the 50 most influential internet gaming companies, published by eGaming Review Magazine, a leading publisher of news concerning the online gambling industry.

Table C: Number of matches a bookmaker posted odds for per season

Season     A      B      C      D      E      F      G      H      I      J      Total
2000-01    -      -      -      305    302    -      300    -      -      303    1,210
2001-02    -      -      -      227    293    -      237    -      -      293    1,050
2002-03    300    -      -      306    294    -      302    -      -      297    1,499
2003-04    305    -      -      306    299    -      306    -      -      302    1,518
2004-05    306    -      305    306    302    255    306    -      -      301    2,081
2005-06    306    -      306    306    302    300    306    306    300    302    2,734
2006-07    299    -      299    299    297    295    299    299    293    299    2,679
2007-08    306    305    306    306    305    305    305    306    304    225    2,973
2008-09    306    306    306    306    303    306    306    306    304    303    3,052
2009-10    306    305    306    306    304    306    306    306    306    303    3,054
2010-11    306    306    306    306    306    306    306    306    305    306    3,059
Total      2,740  1,222  2,134  3,279  3,307  2,073  3,279  1,829  1,812  3,234  24,909

(A dash indicates that the bookmaker posted no odds in that season.)

The distribution of the number of quotes per bookmaker per season can be obtained from table C. In total 24,909 different sets of odds for a home win, away win and a draw are available. In the first season odds from only four bookmakers are on hand; more bookmakers enter the dataset in later seasons. All odds in the dataset are so called 'closing odds', which means that these were the odds bookmakers quoted just before they stopped accepting bets. The odds used in this thesis are quoted in the European (decimal) fashion. When a bookmaker for example offers 1.50 on a win by the home team, a punter who places a bet on that outcome receives 1.50 times his original stake when successful, for a return on investment of 50 percent. However, he loses this stake when the match ends in a draw or an away win. Descriptives regarding odds for the entire dataset are presented in table D below. Notably, the average odds for draws are lower than those for away wins, even though table A shows that more matches end in away wins than in draws.

¹¹ Corporate reports of Ladbrokes and William Hill, 2010.

Table D: Descriptives of odds in the 2000-01 to 2010-11 seasons

          Minimum   Maximum   Mean   Standard Deviation
Home      1.04      17.00     2.37   1.37

IV. METHODOLOGY

The central objective of this thesis is to determine whether the betting market on Dutch football matches is efficient. This issue is divided into two parts. First an analysis of weak form efficiency is set out by means of rather simple and straightforward techniques. A more sophisticated method is used to examine the degree of semi strong form efficiency. The main conclusions of this thesis are based on the results of this latter methodology. To answer the described research question it is important to clarify when a betting market is efficient. In the literature two views on this subject exist in general. According to the first view, a market is efficient when no strategy exists that results in returns different from the bookmaker's margin. The more restrictive view takes the possibility of consistent positive returns as a sufficient determinant of market inefficiency. The final verdict regarding efficiency in this thesis is based on this second, more limiting viewpoint. Before I turn to the examination of weak and semi strong form efficiency, research is conducted regarding the equality of bookmakers in providing odds to the public.

IV.A A comparison of bookmaker odds and margin

An interesting first analysis of the data is an examination of differences between bookmakers with respect to the odds they provide and the margins they require. If the market is efficient, differences in odds across bookmakers should be small. If this is not the case, punters can benefit considerably from selecting among bookmakers. A first step is to inspect the correlation between the odds quoted by the bookmakers. Next, differences are analyzed by conducting an ANOVA. Finally, the odds are regressed on each other through simple OLS estimations. If the constants are significant and the coefficients are not equal to one, differences between the odds provided by bookmakers are present and thus it pays for punters to select a favourable bookmaker.
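The correlation and regression steps described above can be sketched as follows. This is only a minimal illustration in plain Python: the quotes in `odds_a` and `odds_b` are hypothetical, the helper names `pearson` and `ols_fit` are not taken from the thesis, and the significance tests and the ANOVA are left out.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equally long lists of odds."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ols_fit(x, y):
    """Intercept and slope of the least-squares line y = a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical home-win odds quoted by two bookmakers on the same five matches.
odds_a = [1.50, 2.10, 2.75, 1.90, 3.40]
odds_b = [1.45, 2.05, 2.80, 1.85, 3.25]

intercept, slope = ols_fit(odds_a, odds_b)
print(f"correlation = {pearson(odds_a, odds_b):.3f}")
print(f"odds_b = {intercept:.3f} + {slope:.3f} * odds_a")
```

An intercept close to zero and a slope close to one would indicate that the two bookmakers price matches almost identically; whether deviations are significant requires the standard errors, which this sketch does not compute.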

Noting that bookmakers post different odds does not per se imply that bookmakers have different views on the probabilities of match outcomes. The price implied probability of a bookmaker regarding an outcome is simply equal to the reciprocal of its odds. Suppose that a bookmaker provides odds of 2.25, 3.30, and 2.88 for a home win, draw and away win respectively. The price implied probabilities of this bookmaker then are:

$$\frac{1}{2.25} = 0.4444, \quad \frac{1}{3.30} = 0.3030 \quad \text{and} \quad \frac{1}{2.88} = 0.3472$$

Notice that these figures do not add up to unity but to slightly more than that. This is the result of bookmakers not offering 'fair' prices. Instead they include a bookmaker margin or overround in their odds. If some bookmakers simply require a larger margin or overround, they are likely to set 'worse' odds but might have the same expectation about the probabilities of match outcomes. Such a bookmaker margin on an event with n outcomes can be formulated as:

$$m = 1 - \sum_{i=1}^{n} p_i \cdot d_i \cdot w_i \tag{1}$$

where $m$ is the expected gain of a bookmaker on an event with $n$ outcomes, $p_i$ represents the probability of outcome $i$, $d_i$ are the quoted odds on outcome $i$ and $w_i$ is the percentage of bets placed on outcome $i$. A more intuitive interpretation of this bookmaker margin arises if we approach it from the punter's perspective. It then resembles the expected loss a punter makes when he chooses to bet an equal amount on all possible outcomes (home win, draw, and away win). Unfortunately, equation (1) requires us to know the distribution of bets across outcomes. Whereas the odds on every possible outcome are publicly available, the number of bets on each outcome is not. This is often solved in the literature (for example by Vlastakis et al. (2009)) by calculating an implied margin $m'$:

$$m' = \sum_{i=1}^{n} \frac{1}{d_i} - 1 = \sum_{i=1}^{n} \pi_i - 1 \tag{2}$$

where $\pi_i = 1/d_i$ denotes the price implied probability of outcome $i$. This implied margin assumes that bets are equally distributed across the possible outcomes and that the odds are set according to the true probabilities. Returning to the bookmaker that provided odds of 2.25, 3.30, and 2.88 for a home win, draw and away win respectively, his implied bookmaker margin is equal to:

$$\frac{1}{2.25} + \frac{1}{3.30} + \frac{1}{2.88} - 1 = 0.0947 \text{ or } 9.47\%$$

To obtain the 'true' implied probabilities this bookmaker attaches to all three outcomes we have to correct the previously calculated price implied probabilities for this bookmaker margin:

$$\frac{1/2.25}{1.0947} = 0.4060, \quad \frac{1/3.30}{1.0947} = 0.2768 \quad \text{and} \quad \frac{1/2.88}{1.0947} = 0.3172$$


in bookmaker margin are evaluated between bookmakers and across seasons by conducting a two way ANOVA.
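The margin calculations of this section can be reproduced with a few lines of Python. The sketch below uses the example odds of 2.25, 3.30 and 2.88 from the text; the function names are illustrative and not part of the thesis.

```python
def price_implied_probabilities(odds):
    """Reciprocal of each decimal quote (these do not sum to one)."""
    return [1.0 / d for d in odds]


def implied_margin(odds):
    """Implied bookmaker margin: sum of the reciprocal odds minus one."""
    return sum(price_implied_probabilities(odds)) - 1.0


def true_implied_probabilities(odds):
    """Price implied probabilities rescaled so that they sum to one."""
    raw = price_implied_probabilities(odds)
    total = sum(raw)
    return [p / total for p in raw]


odds = [2.25, 3.30, 2.88]  # home win, draw, away win
print(round(implied_margin(odds), 4))
print([round(p, 4) for p in true_implied_probabilities(odds)])
```

Dividing by the sum of the reciprocal odds is the same operation as dividing by one plus the implied margin, which is the correction applied in the text.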

IV.B Weak form efficiency

IV.B1 Arbitrage opportunities

A logical starting point in the examination of market efficiency in betting markets is the search for arbitrage opportunities. In a completely efficient market arbitrage opportunities are assumed to be absent. Research regarding the presence of arbitrage opportunities in race track betting markets often concludes that these opportunities are absent or very small (Hausch and Ziemba (1990)). However, considerable evidence regarding arbitrage opportunities in sports betting markets is available that supports the presence of significant arbitrage opportunities. For example, Vlastakis et al. (2008) find arbitrage opportunities (although rare) in the European football betting market fluctuating from 12 to 200 percent. Franck et al. (2010), using the same market, find a positive risk free return in 26 percent of the matches when bookmaker odds are combined with those available on betting exchanges.

An arbitrage opportunity with respect to a bet on a football match can arise only when there are multiple bookmakers providing odds on the same match. As mentioned, when a bettor places an equal amount of money on all the outcomes quoted by one bookmaker he will always make an expected loss equal to the margin of the bookmaker. Therefore, for an arbitrage opportunity to arise a bettor should combine the maximum (best) odds quoted by different bookmakers. When the combined margin of the bookmakers is negative, then an arbitrage opportunity exists. Formally we have:

$$A = \left( \sum_{i=1}^{n} \frac{1}{\max_{j \in J} d_{ij}} \right) - 1 < 0 \tag{3}$$

where $A$ is the combined margin and $\max_{j \in J} d_{ij}$ represents the best quote on outcome $i$ available in the set of bookmakers $J$. I will illustrate this principle by the following example.


Table E: Arbitrage opportunity example

                                              PSV (home win)   Draw     Ajax (away win)
Odds by bookmaker A                           1.9              3.1      4.2
Odds by bookmaker B                           1.6              3.4      6
Best available odds                           1.9              3.4      6
Price implied probability (1 / best odds)     0.5263           0.2941   0.1667
Proportions to be placed (equation (4))       0.5332           0.2980   0.1688

The maximum quote for a win by PSV is offered by bookmaker A (1.9); the best quotes for a draw and a win by Ajax are offered by bookmaker B (3.4 and 6 respectively). Equation (3) now gives a combined margin of -0.013. There thus is an arbitrage opportunity. To exploit this opportunity a combined bet must be structured. The total value of this bet should be divided between individual stakes on every outcome in proportion to the price implied probabilities of every outcome (Vlastakis et al. (2006)). In equation form we have:

$$w_i = \frac{\pi_i}{\sum_{k=1}^{n} \pi_k} \tag{4}$$

where $w_i$ is the proportion of the total bet that should be placed on outcome $i$ and $\pi_i$ is the price implied probability of outcome $i$ at the best available odds. Applying equation (4) to the example above results in weights as shown in the fifth row of table E above. Suppose now that a bettor has an amount of $100 that he would like to invest. In that case the weights and pay offs are as provided in table F below. In this particular case there is an arbitrage opportunity resulting in a 1.31 percent risk free profit by combining odds from two bookmakers, no matter what the outcome of the match will be.
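The PSV versus Ajax example can be checked with the short sketch below. It takes the quotes of bookmakers A and B from table E; the function names and the dictionary layout are illustrative choices, not part of the thesis.

```python
def best_odds(quotes):
    """Highest quote per outcome (home, draw, away) across all bookmakers."""
    return [max(column) for column in zip(*quotes.values())]


def combined_margin(odds):
    """Sum of reciprocal best odds minus one; negative means arbitrage."""
    return sum(1.0 / d for d in odds) - 1.0


def stake_proportions(odds):
    """Share of the total stake per outcome, proportional to 1 / odds."""
    implied = [1.0 / d for d in odds]
    total = sum(implied)
    return [p / total for p in implied]


quotes = {"A": [1.9, 3.1, 4.2], "B": [1.6, 3.4, 6.0]}
best = best_odds(quotes)
margin = combined_margin(best)
stakes = [100.0 * w for w in stake_proportions(best)]   # total stake of 100
payoffs = [s * d for s, d in zip(stakes, best)]         # payout per outcome

print(best, round(margin, 3))
print([round(s, 2) for s in stakes], [round(p, 2) for p in payoffs])
```

Because each stake is proportional to the reciprocal of its odds, the payout is identical whichever outcome occurs, which is what makes the profit risk free.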

Table F: Weights and pay offs of the arbitrage opportunity example

                       Home win    Draw        Away win
Amount to invest       $53.32      $29.80      $16.88
Return: A              $101.31     -           -
Return: B              -           $101.31     $101.31
Total investment       -$100.00    -$100.00    -$100.00
Profit                 $1.31       $1.31       $1.31
Return on investment   1.31%       1.31%       1.31%

IV.B2 Simple betting rules


designed. The returns from these strategies are reported and analyzed. The strategies that are explored are:

1. Betting only on the home team, away team, or a draw.
2. Betting on the favourite or betting on the underdog.
3. Combinations of the previous rules.
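How the return of such a rule might be computed is sketched below, assuming a flat stake of one unit per bet. The three matches, the function names and the convention that a rule returns `None` to skip a match are all hypothetical and serve only as illustration.

```python
OUTCOMES = ("H", "D", "A")  # home win, draw, away win


def flat_stake_return(matches, pick):
    """Average profit per unit staked when betting one unit on pick(odds)."""
    profit, bets = 0.0, 0
    for odds, result in matches:
        choice = pick(odds)          # index 0, 1, 2 or None (no bet)
        if choice is None:
            continue
        bets += 1
        if OUTCOMES[choice] == result:
            profit += odds[choice] - 1.0
        else:
            profit -= 1.0
    return profit / bets if bets else 0.0


def always_home(odds):
    return 0


def favourite(odds):
    return odds.index(min(odds))     # lowest odds = favourite


def underdog(odds):
    return odds.index(max(odds))     # highest odds = underdog


def home_favourite(odds):
    """Combination rule: bet on the home team only when it is the favourite."""
    return 0 if favourite(odds) == 0 else None


# Hypothetical matches: (home, draw, away) decimal odds and the actual result.
matches = [
    ((1.50, 4.00, 6.00), "H"),
    ((2.80, 3.20, 2.40), "A"),
    ((2.00, 3.30, 3.50), "D"),
]

for rule in (always_home, favourite, underdog, home_favourite):
    print(rule.__name__, round(flat_stake_return(matches, rule), 3))
```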

IV.B3 Implied outcomes compared with match outcomes

To further examine the weak form efficiency of the betting market on Dutch football matches the accuracy of the predictions made by bookmakers regarding the outcome of matches is assessed. This is done by comparing the implied probabilities signalled by bookmakers through the odds they provide with ex post determined outcome probabilities. As discussed, the odds provided by bookmakers contain an overround. Therefore the probabilities for match outcomes derived from the odds are unfair. Hence, true implied probabilities are calculated following the methodology described in the previous sections. Next, implied probabilities for all outcomes are sorted and divided into 18 equally sized groups, each containing 187 events. Average corresponding outcome probabilities then are calculated by dividing the number of correctly predicted events by the total number of events per group. According to Kuypers (2000) implied probabilities should equal outcome probabilities in an efficient market. Furthermore, returns should be roughly equal across groups, implying that a punter cannot gain from betting on a particular odds level. In a final stage the outcome probabilities are regressed on the implied probabilities using simple OLS estimation.
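The grouping step described above can be sketched as follows. The function name, the input layout (pairs of implied probability and a 0/1 indicator of whether the outcome occurred) and the six example events are assumptions made for illustration; the thesis itself uses 18 groups.

```python
def calibration_table(events, n_groups):
    """Sort events by implied probability, split them into n_groups groups of
    (nearly) equal size and return, per group, the mean implied probability
    and the observed frequency of the outcome."""
    ordered = sorted(events, key=lambda e: e[0])
    n = len(ordered)
    table = []
    for g in range(n_groups):
        group = ordered[g * n // n_groups:(g + 1) * n // n_groups]
        mean_implied = sum(p for p, _ in group) / len(group)
        observed = sum(1 for _, occurred in group if occurred) / len(group)
        table.append((mean_implied, observed))
    return table


# Hypothetical (implied probability, outcome occurred) pairs.
events = [(0.7, 1), (0.1, 0), (0.4, 0), (0.8, 1), (0.2, 0), (0.3, 1)]
for implied, observed in calibration_table(events, 3):
    print(f"implied {implied:.2f}  observed {observed:.2f}")
```

In an efficient market the two columns should be close to each other in every group; regressing the observed frequencies on the mean implied probabilities is the final step mentioned in the text.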

IV.C Semi strong form efficiency

A betting market is semi strong form efficient when publicly available data cannot be used to obtain better forecasts than those implied by bookmaker odds. In addition it should not be possible to create a profitable betting strategy using this information. Several studies discussed in the literature review have indicated that the odds provided by bookmakers do not reflect true probabilities of outcomes as they take bettor preferences into account. Using accurate model forecasts one should be able to exploit this behaviour. It is assumed that bookmakers do not possess any inside information. Regarding football matches such information could for example be inside knowledge about injuries of key players, or tendencies of football players to intentionally lose matches. It is unlikely that such information would be available to bookmakers and not to the media.

To test whether football betting markets are semi strong form efficient, in general two methods are available: the indirect and the direct method. The indirect method estimates the number of goals scored by both teams and uses these numbers to predict match outcomes. The direct method estimates match outcomes directly. In this thesis both methods are employed as both have their advantages and disadvantages. Goddard (2005) argues that indirect models draw on a more extensive dataset, containing more information about the relative strength of both teams than the direct model. In addition they can be used for popular over/under bets which require an estimate of the total number of goals scored. Disadvantages of this model are that goals data might contain more noise and that the goals scored by both teams to some extent could be interdependent. Direct models do not face these problems and are more straightforwardly estimated. Both models in this thesis are estimated by including various variables containing publicly available data that is believed to influence match results. The variables are discussed in detail in section C3. The models are judged on statistical and economic performance to provide a judgement on the degree of semi strong form efficiency.
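The second step of the indirect method, turning expected goals into match outcome probabilities, can be sketched as below. The sketch assumes that home and away goals follow independent Poisson distributions, which is a simplification (the text notes that the two may be interdependent), and the expected goal values of 1.8 and 1.2 are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number is lam."""
    return exp(-lam) * lam ** k / factorial(k)


def outcome_probabilities(lam_home, lam_away, max_goals=15):
    """(home win, draw, away win) probabilities from expected goals,
    assuming independent Poisson scores truncated at max_goals."""
    home = [poisson_pmf(k, lam_home) for k in range(max_goals + 1)]
    away = [poisson_pmf(k, lam_away) for k in range(max_goals + 1)]
    p_home = p_draw = p_away = 0.0
    for h, ph in enumerate(home):
        for a, pa in enumerate(away):
            if h > a:
                p_home += ph * pa
            elif h == a:
                p_draw += ph * pa
            else:
                p_away += ph * pa
    return p_home, p_draw, p_away


# Hypothetical expected goals for the home and away team.
print([round(p, 3) for p in outcome_probabilities(1.8, 1.2)])
```

The same score matrix could also be summed over total goals to price over/under bets, which is the additional use of the indirect model mentioned in the text.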

IV.C1 The indirect model

When focusing on the task of forecasting goal scoring processes, a Poisson count regression is the obvious choice. The numbers of goals scored by home and away teams are variables that can take on only non-negative integer values, resulting in a mean above zero. Furthermore, many writers, for example Dixon and Coles (1997) and Cain et al. (2000), show that goal scoring processes are well approximated by the Poisson model. The Poisson regression model assumes that the dependent variable has a Poisson distribution and that the logarithm of its expected value can be modelled by a linear combination of explanatory variables with unknown parameters, which are estimated by maximum likelihood. In the Poisson model the probability distribution of the number of occurrences of an event is given by:

$$\Pr(Y = y) = \frac{e^{-\lambda} \lambda^{y}}{y!}, \quad y = 0, 1, 2, \ldots \tag{5}$$

the Poisson distribution through the parameterization of the relation between the mean parameter $\lambda$ and the regressors $x$, using the exponential mean parameterization:

$$\lambda_i = \exp(x_i'\beta), \quad i = 1, \ldots, n \tag{6}$$

The conditional density of $y_i$, the number of occurrences (goals), given $x_i$, the regressors, is

$$\Pr(y_i \mid x_i) = \frac{e^{-\lambda_i} \lambda_i^{y_i}}{y_i!}, \quad \ln \lambda_i = x_i'\beta \tag{7}$$

The Poisson regression model is estimated by maximizing the following log-likelihood function:

$$\ln L(\beta) = \sum_{i=1}^{n} \left\{ y_i x_i'\beta - \exp(x_i'\beta) - \ln y_i! \right\} \tag{8}$$

IV.C2 The direct model

Football matches can result in three possible outcomes: a home win, a draw and an away win. To model these outcomes directly this thesis employs an ordered probit regression. This is done to account for the ordinal structure of match outcomes. If a strong team is playing at home against a weaker team the most likely match outcome is a home win, then a draw, and finally an away win. There thus is a natural order present in the dependent variable. This property is best captured by the ordered probit model.

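The underlying ordered probit model is of the form:

$$y^*_{i,j} = x_{i,j}'\beta + \varepsilon_{i,j} \tag{9}$$

where $y^*_{i,j}$ is a latent dependent variable providing the result of a match between teams $i$ and $j$, $x_{i,j}$ is a vector of explanatory variables, $\beta$ is a vector of parameters to be estimated and $\varepsilon_{i,j}$ is a standard normal, independent and identically distributed disturbance term. Observed is the outcome of the game:

$$\begin{aligned} \text{Home win:} \quad & y_{i,j} = 1 && \text{if } \mu_2 < y^*_{i,j} \\ \text{Draw:} \quad & y_{i,j} = 0.5 && \text{if } \mu_1 < y^*_{i,j} < \mu_2 \\ \text{Away win:} \quad & y_{i,j} = 0 && \text{if } y^*_{i,j} < \mu_1 \end{aligned} \tag{10}$$

$\mu_1$ and $\mu_2$ are threshold values that are estimated by the model as well. The predicted probabilities then are:

$$\begin{aligned} \Pr(y_{i,j} = 1) &= 1 - \Phi(\mu_2 - \beta'x_{i,j}) \\ \Pr(y_{i,j} = 0.5) &= \Phi(\mu_2 - \beta'x_{i,j}) - \Phi(\mu_1 - \beta'x_{i,j}) \\ \Pr(y_{i,j} = 0) &= \Phi(\mu_1 - \beta'x_{i,j}) \end{aligned} \tag{11}$$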

Where Φ( ) is the standard normal cumulative distribution function. The coefficients (β’s) in the ordered probit model are easily interpreted due to the nature of the ordered classes. A positive sign implies that the variable increases the chance of a match resulting in a home win. A negative sign implies the opposite.
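The predicted probabilities of the ordered probit model can be evaluated with the standard library alone. In the sketch below the linear index `xb` (the product of coefficients and explanatory variables) and the two thresholds, called `mu1` and `mu2` here, are hypothetical inputs; estimating them from data is not shown.

```python
from math import erf, sqrt


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def ordered_probit_probabilities(xb, mu1, mu2):
    """(home win, draw, away win) probabilities for a linear index xb
    and thresholds mu1 < mu2."""
    p_away = norm_cdf(mu1 - xb)
    p_draw = norm_cdf(mu2 - xb) - norm_cdf(mu1 - xb)
    p_home = 1.0 - norm_cdf(mu2 - xb)
    return p_home, p_draw, p_away


# Hypothetical values: a neutral match and one that favours the home team.
print([round(p, 4) for p in ordered_probit_probabilities(0.0, -0.5, 0.5)])
print([round(p, 4) for p in ordered_probit_probabilities(0.4, -0.5, 0.5)])
```

Raising the index shifts probability mass from the away win towards the home win, which is the sense in which a positive coefficient increases the chance of a home win.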

IV.C3 Explanatory variables

In this section the explanatory variables included in the models that are built to forecast goal scoring processes and match outcomes are defined and explained. In total eight explanatory variables are used: historical team quality, recent performance, elimination from the Dutch cup, elimination from European competitions, distance between home grounds, number of fans, significant incentive and shirt colour. Descriptive statistics for these variables are presented in table G at the end of this chapter.

IV.C3.1 Historical team quality

Stronger teams are more likely to win matches than weaker teams. Therefore it makes sense to capture this effect by including a variable indicating the long term success rate of the home and away playing teams. Different methods are employed in the literature to calculate this variable. Goddard and Asimakopoulos (2004) transform the match points gained by teams in the past two seasons and the current season, where 1 point is awarded for a win, 0.5 for a draw and 0 for a loss. They also use points gained by teams in lower competitions in previous seasons. Kuypers (2000) uses a simpler estimator of historical team quality by inserting the cumulative difference between the match points gained by both teams in the current season. In this thesis the historical quality of a team is measured by summing the total number of points collected in the current season (up to the moment of the match) and the total number of points collected in the last two seasons. In this timeframe matches are still assumed to be informative of the current quality of a team while this period is extensive enough to correct for the variance of short term results. The variables that measure the effect of the historical quality of a team are named jklmj for the home team and jklmnd for the away team. The numbers of points collected by the teams in the seasons preceding season 2000-01 are collected from the official website of the Dutch Eredivisie: www.eredivisielive.nl/eredivisie.

A better historical performance of the home team should increase the chance of a match ending in a home victory whereas a better historical performance of the away team should decrease the probability of a match ending in a home win. Therefore the signs of jklmj and jklmnd are expected to be positive and negative respectively in the direct model. The same applies to the estimation of home goals; for away goals the signs are expected to be reversed.
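A minimal sketch of the historical quality measure defined in this section is given below. It assumes the league's three points for a win and one for a draw; the function names, the result codes and the season totals of 58 and 61 points are hypothetical.

```python
POINTS = {"W": 3, "D": 1, "L": 0}  # win, draw, loss


def season_points(results):
    """Points collected from a list of results coded 'W', 'D' or 'L'."""
    return sum(POINTS[r] for r in results)


def historical_quality(current_season_results, previous_two_season_totals):
    """Points in the current season so far plus the totals of the
    two preceding seasons."""
    return season_points(current_season_results) + sum(previous_two_season_totals)


# Hypothetical team: four matches played this season, 58 and 61 points before.
print(historical_quality(["W", "D", "L", "W"], [58, 61]))
```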

IV.C3.2 Recent performance

Many statements in the media regarding football are about the 'shape' of football teams, which refers to outcomes of recently played matches. Teams 'in shape' are expected to perform better in upcoming matches than teams 'out of shape'. Kuypers (2000) includes the difference in the average number of points collected by both teams in the three previous matches. Points gained in away matches are weighted more heavily than points gained in matches at home. Goddard and Asimakopoulos (2004) acknowledge that for home teams recent home outcomes are more informative of future match outcomes, while for away playing teams outcomes of away matches are more informative of future match outcomes. They use the most recent nine home outcomes and most recent four away outcomes. Bearing this in mind the variable indicating recent performance in this thesis is constructed as follows: for the home playing team the total points collected in the previous five home matches are summed, for the away playing team the points gained in the five most recent away matches are added. This results in the variables ljnj and ljnnd. The signs of both variables are expected to be the same as the signs of the historical team quality variables.
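The construction of the recent performance variable can be sketched as follows. The match list layout, the function name `recent_points` and the team labels X, Y and Z are assumptions made for illustration only.

```python
def recent_points(matches, team, venue, window=5):
    """Points won by `team` in its last `window` matches at `venue`
    ('home' or 'away'). `matches` must be in chronological order and holds
    tuples (home_team, away_team, home_goals, away_goals)."""
    points = []
    for home, away, home_goals, away_goals in matches:
        if venue == "home" and home == team:
            diff = home_goals - away_goals
        elif venue == "away" and away == team:
            diff = away_goals - home_goals
        else:
            continue
        points.append(3 if diff > 0 else 1 if diff == 0 else 0)
    return sum(points[-window:])


# Hypothetical results, oldest first.
matches = [
    ("X", "Y", 2, 0),
    ("Y", "X", 1, 1),
    ("X", "Z", 0, 1),
    ("Z", "X", 0, 3),
    ("X", "Y", 1, 1),
]
print(recent_points(matches, "X", "home"), recent_points(matches, "X", "away"))
```

For a home team the function would be called with `venue="home"` and for an away team with `venue="away"`, matching the definition in the text.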

The values of ljnj and ljnnd obviously contribute to the variables measuring historical team quality and are therefore correlated with them. Table 2 in the appendix provides an overview of the correlation between all explanatory variables and reveals that in fact many of the variables are significantly correlated to some extent. However, the correlation figures are deemed to be sufficiently low to refrain from dropping variables.

IV.C3.3 Elimination from the Dutch cup

In the Netherlands one other competition, the Dutch cup, is organized in which teams from the Eredivisie compete with each other as well as with teams from lower divisions. The Dutch cup is played in tournament form following a knock out system. A draw decides which teams play each other; the winner goes through to the next round. When matches are tied after ninety minutes two halves of extra time are played. If the match is still undecided after extra time a penalty shoot out decides which team proceeds to the next round. Whereas games in the Eredivisie are played in the weekends, matches in the Dutch cup usually take place midweek, resulting in an extra burden for players. It is often argued that teams that are still active in the cup are more fatigued than teams that are already knocked out. Furthermore, knocked out teams can focus completely on the regular competition and there is less chance of players getting injured during matches. On the other hand, making a deep run in the Dutch cup can boost the confidence of teams, resulting in a better performance in league matches. Previous research mostly finds that teams eliminated early in cups perform worse in future league matches (e.g. Goddard and Asimakopoulos (2004), Goddard (2005)).

The effect of elimination from the Dutch cup is measured by the dummy variables opqj and opqnd. These variables take on a value of one for teams that are still active in the Dutch cup and are zero for teams that are already knocked out of the tournament. Results of matches in the Dutch cup are collected from the website www.betexplorer.com/football/netherlands.

IV.C3.4 Elimination from European competitions

In Europe several international tournaments are organized during the football season including the UEFA Champions League and the UEFA Cup. The best performing teams in the Eredivisie are among the invitees to these competitions. Participating in these competitions is a lucrative business. Like matches in the Dutch cup these European matches take place midweek, resulting in a more crowded agenda for participating teams. This might again lead to higher fatigue levels and more injured players. The effect of playing in European competitions is likely to be more significant than that of playing in the Dutch cup as opponents are often stronger, causing more intense matches. Furthermore the travel time is longer. Then again, teams that perform well in European competitions might gain boosted confidence levels. The effect of playing in European competitions is measured by the variables qpqj and qpqnd for home and away teams respectively. The variables take on a value of one when a team is still active in a European competition. Teams that do not play in European context or are already knocked out are marked with a zero. Results from European competitions are collected from www.betexplorer.com.

IV.C3.5 Distance between home grounds

As mentioned before, home ground advantage is a well-documented phenomenon in the literature on sports betting markets. One explanation is that the away team has to travel to the home team's stadium, which disturbs its match preparation. A larger geographical distance between teams results in a stronger home ground advantage, whereas the effect vanishes in local derbies (Pollard (2008)). Goddard and Asimakopoulos (2004) use the log of the total distance between two teams to account for travel effects. They find that the distance between home and away teams has a significant positive effect on the probability of a home win. In this thesis the effect of travel is measured by the distance between the cities of the home and away team, resulting in the variable oklr,d. Distances are measured in kilometers and are collected online through Google Maps. As a larger distance between the home and away team should increase the probability of a home win, the sign of this variable is expected to be positive in the direct model and in the indirect model forecasting home goals. In the indirect model that estimates away goals the sign is expected to be negative.
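For readers who want to reproduce a distance covariate without a mapping service, the sketch below computes a great-circle distance with the haversine formula. This is a substitute technique, not the method of the thesis, which takes distances from Google Maps; those figures need not coincide with great-circle values. The city coordinates are rounded approximations, the function names are hypothetical, and the natural-log transform with a one-kilometer floor is only an assumption about how a log-distance variant in the spirit of Goddard and Asimakopoulos (2004) might be handled for same-city derbies.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def log_distance(km, floor_km=1.0):
    """Natural log of distance; the floor (an assumption) avoids log(0)."""
    return math.log(max(km, floor_km))


# Approximate city coordinates (latitude, longitude), for illustration only.
AMSTERDAM = (52.37, 4.90)
GRONINGEN = (53.22, 6.57)

distance_km = haversine_km(*AMSTERDAM, *GRONINGEN)
derby_km = haversine_km(*AMSTERDAM, *AMSTERDAM)  # same city, zero distance
```

The raw kilometer value corresponds to the level specification described above, while `log_distance` illustrates the alternative transform; which of the two fits the data better is an empirical question.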

IV.C3.6 Attendances

Teams with large fan bases and hence high attendance levels are more likely to win matches for several reasons. A large crowd can influence matches directly by supporting its team or by creating a 'hostile' environment for the visiting team. Furthermore, teams that attract large crowds benefit indirectly from the revenues those visitors generate: teams in a better financial position are better able to retain and acquire player talent (Forrest et al. (2005)). Another reason to include an attendance variable is that it can capture a possible sentiment or 'fan' effect, which is discussed in detail in the literature review.

To inspect the effect of attendance levels on match outcomes the variables nrrj and nrrnd are constructed. nrrj is the average attendance level of the home team in the previous season (in thousands), while nrrnd is the average attendance level of the away team in the
