### Master’s Thesis

## The determinants of the transfer fees of professional football players

### Sander Bunschoten

### Student number: 12283401

### Date: July 15, 2022

### Master’s programme: Econometrics

### Specialisation: Econometrics

### Supervisor: Dr. J.C.M. van Ophem

### Second reader: Dr. J.C. Meier

### Faculty of Economics and Business

### Amsterdam School of Economics


### Abstract

This thesis investigates the determinants of the transfer fees of professional football players. Linear regression is performed, with a Heckman correction to account for sample selection bias. Only a small number of football clubs disclose exact transfer fees. The undisclosed transfer fees are collected from media reports and are expected to contain measurement errors. Additional regressions are performed to analyze whether the results obtained from the data without measurement errors differ from those obtained from the full data set. The results suggest that the most important determinants of a player's transfer fee are the number of goals scored per 90 minutes, the number of assists given per 90 minutes, the number of 90 minutes played, whether the player plays in a top five league, and the number of yellow cards received per 90 minutes.

### Statement of Originality

This document is written by Sander Bunschoten who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.

### Contents

1 Introduction

2 Theoretical background

3 Model and methodology

3.1 Model setup

3.2 Estimation

4 Data

5 Results

6 Conclusion and discussion

References

### 1 Introduction

Football is currently one of the biggest and most popular sports in the world. Starting as a game played in only a few countries, football has gained popularity worldwide over time and has now become the number one sport throughout the world (Nielsen Sports, 2018). As a result of this growing popularity, the financial side of the football industry became more important. Professional football clubs noticed that money became essential to keep their top players and to sign players from other clubs. In 1973, FC Barcelona signed Johan Cruijff from Ajax for 1 million euros, which was at that time the biggest transfer in football history. Since then, the transfer fees that clubs are willing to pay have increased much further. In 1997, Internazionale broke the record by paying 26.5 million euros for Ronaldo Nazário. In 2017, Paris Saint-Germain set the record at 222 million euros for Neymar Jr.

The transfer fee that football clubs are willing to spend on a player is determined by many factors. Clubs want to value the impact a player is going to have on their success, which is hard to determine in advance. However, research has shown that multiple measurable factors are likely to have an effect on transfer fees. These include characteristics of a player, such as age, and performance statistics, such as the number of goals scored and the number of assists given. For clubs, it is useful to know which of these factors matter in the transfer market.

Many researchers who tried to find the determinants of transfer fees used linear models to explain the effect of multiple factors on the transfer fee. Many papers encountered problems that had to be accounted for. For instance, some papers note that the analysis of football transfers is hampered by selection bias, because transfer fees are not observed for all players. It is pointed out that some players are more likely to be transferred than others, and therefore observing only transferred players leads to a non-random sample. To account for this, Heckman corrections were used, which model the likelihood that a player is going to make a transfer.

Another issue noted in previous research is the lack of transparency in the football industry. This limits the available data and therefore also hampers the analysis of transfer fees. For instance, exact transfer fees and information about the remaining contract length of players are hard to obtain. Furthermore, most studies use older data while the football industry is continuously growing. Nowadays, a lot of data on the performances and characteristics of players are available which were not available in the past. A possible effect of the coronavirus pandemic has also not been extensively investigated yet. Because football clubs lost out on revenues due to the pandemic (Financial Times, 2021), clubs may try to negotiate transfer fees to lower levels than before the pandemic. In this thesis, the most recent data will be used to estimate the effect that different factors have on the transfer fee of players. Therefore, this thesis aims to answer the research question: Which factors determine the transfer fee of professional football players?

This question will be answered by constructing a multiple linear regression model, which has the transfer fee of a player as the dependent variable. Variables which cover the performances and characteristics of the player are used as explanatory variables. Examples of performance statistics are the number of goals scored, the number of assists given and the number of tackles made by a player. Characteristics that are used as explanatory variables include the age of a player and the remaining contract duration of the player. Problems that arise, such as selection bias, will be investigated and corrected for using a Heckman correction.

This thesis uses a wide range of data on the performance statistics, characteristics and transfer fees of players. Data is obtained for the ten football leagues with the highest UEFA coefficients over five consecutive years. The remaining contract duration of the players is observed for each of these years. In this research, 11,980 observations on players are used, of which 1,523 correspond to a transferred player. The data concerning the transfer fees of players are acquired from Transfermarkt.com. The exact values of transfer fees are often not disclosed by the clubs involved, so Transfermarkt.com uses media reports to collect these transfer fees. Although the transfer fees on this site are often used by other researchers, it must therefore be noted that they may contain measurement errors. Transfer fees that can be considered accurate are those disclosed by publicly traded football clubs. These clubs are obligated to publish the exact transfer fees, since they are listed on stock exchanges. Examples of publicly traded football clubs are Manchester United, Juventus, Dortmund, Ajax, and Benfica. The analysis exploits the higher accuracy of these transfer fees by adding a regression which is performed only on data concerning publicly traded clubs. Additionally, it is tested whether the results obtained by only using officially published transfer fees and the results obtained by only using transfer fees that were not officially published are significantly different.

An overview of the existing research on this topic is given in Section 2. This section describes the results that have been found and the models and data that have been used to obtain these results. The model which is constructed in this paper and the methodology used are presented in Section 3. Section 4 describes the data. The results of the research are presented in Section 5. Lastly, Section 6 presents the conclusions and discussion.

### 2 Theoretical background

To win a football match, it is essential to have players that are able to score goals and thus also to have players that are able to assist the goalscorers. Consequently, it has been shown by, amongst others, Majewski (2016) that the number of goals scored and the number of assists given are factors which have a positive effect on the value of a forward. Majewski (2016) obtained these results by investigating several regression models and estimation methods. Since an LM-test indicated that the relation between the dependent and the independent variables is linear, only linear models were used. In particular, these are ordinary least squares, generalized least squares and feasible generalized least squares with correction for heteroskedasticity, where the best results were obtained for the feasible generalized least squares method. In this paper, the (natural) logarithm of the value of the player is used as the dependent variable.

Different results were found by Ruijg and Van Ophem (2015), who included all positions in the analysis and concluded that age, the average number of minutes played and not being a goalkeeper are the most important factors that explain the transfer fee of a player. More specifically, the effect of age was found to be non-linear and concave, which was obtained by adding the squared value of age. The effect of the number of minutes played was found to be positive. In contrast to Majewski (2016), Ruijg and Van Ophem (2015) found that the number of goals scored did not have an effect on the value of the player. Although both papers made use of the linear relation between the value of the player and the explanatory variables, they differ in the use of models and estimation methods. Ruijg and Van Ophem (2015) noted that the analysis is hampered by selection bias and therefore proposed an estimation method that corrects for this, by introducing an ordered probit model on transfers which are either a decline or an improvement. In addition, a Heckman selection model was constructed to take account of the selection bias in the basic linear regression model. According to the paper, this is necessary because not all transfer fees are observed and a considerable number of players are unable to find a new club.

The problem of sample selection bias has been noted in more papers. For instance, Carmichael, Forrest, and Simmons (1999) found that participation in the transfer market for football players is not random; some players are more likely to be transferred than others. It was concluded that sample selection bias is indeed present. Similar to Ruijg and Van Ophem (2015), Carmichael, Forrest, and Simmons (1999) constructed a linear regression model with Heckman correction to take account of this. More specifically, the Heckman two-step procedure was used for the estimation. From this, it was found that players who represent a higher transfer value are more likely to be transferred. The finding of Ruijg and Van Ophem (2015) that age has a non-linear and concave effect on the transfer fee was also obtained by Carmichael, Forrest, and Simmons (1999). Going beyond other research, interaction terms between the position of a player and other variables were investigated. It was for instance found that the number of goals scored has a positive effect for both attackers and midfielders in comparison to the effect for goalkeepers. This effect was found to be higher for attackers than for midfielders.

The use of linear regression models is common in the analysis of transfer fees. For instance, Poli, Besson, and Ravenel (2021) confirmed the finding of Majewski (2016) that the number of goals scored by a player and the number of assists given by this player have a positive effect on the value of a football player by using a multiple linear regression model. In this paper, a log-transformation of the dependent variable is used. Unlike most other research papers on this topic, the research considered the remaining duration of the contract of a player as an explanatory variable. It was found that the remaining duration of the contract and age are the most important variables which have an effect on the transfer fee. In particular, the effect of the remaining duration of the contract was found to be positive. As opposed to the papers mentioned earlier, Poli, Besson, and Ravenel (2021) included age but not the squared value of age as an explanatory variable in the analysis. The effect of age on the transfer fee was found to be negative.

A common issue in the analysis of transfer fees is the lack of available data, due to a lack of transparency in the football industry. For instance, in the paper by Ruijg and Van Ophem (2015), 55 transfer fees were observed from a total of 373 players, corresponding to one competition (the Premier League) from one season (2011-2012). The paper by Carmichael, Forrest, and Simmons (1999) also included one season (1993-1994) and one country (England), and used 2029 observations in the analysis, where 240 observations corresponded to a transferred player. This paper pointed out the possible importance of the remaining contract duration of a player for the transfer fee, but was not able to retrieve these data. This had also been emphasized by Dobson and Gerrard (1999), who stated that from the 1996-1997 season onward, the expiration date of a player’s contract was likely to have become a key determinant of a transfer fee, especially for players within 12 months of becoming free agents. This is due to the Bosman ruling, which allowed players in the EU to move to another club at the end of their contract without a transfer fee being paid. The research by Poli, Besson, and Ravenel (2021), which included 2045 transfers from the top five competitions (English Premier League, Spanish La Liga, Italian Serie A, German Bundesliga, and French Ligue 1) between 2012 and 2021, did include information about contract duration. However, it is not clear which source was consulted to obtain this information.

Another issue concerning the data collection is the collection of the transfer fees. Most recent research on this topic, for instance the research by Poli, Besson, and Ravenel (2021), uses Transfermarkt.com as its source, but it is unclear how accurate these data are. As Poli, Besson, and Ravenel (2021) already pointed out, most transfer fees are not officially published. Transfermarkt.com includes transfer fees which are collected from media reports. Therefore, most of the transfer fees on this website might not be fully accurate and may include measurement errors. The fees which are generally accurate are the fees from transfers where a publicly traded football club is involved, since these clubs are obligated to disclose the exact transfer fees. However, these span only a small number of clubs.

Another possible effect on transfer fees to consider is the effect that the coronavirus pandemic had on the football industry. Because of the pandemic, which officially started in March 2020, European football clubs are on course to miss out on roughly 9 billion euros of revenues (Financial Times, 2021). This has led to declining transfer expenditure in European football. For instance, according to Deloitte (2021), the clubs from the English Premier League spent around 1.1 billion pounds in the 2021 summer transfer window, which is 11 percent lower than the spend during summer 2020 (1.3 billion pounds), while this was already a 9 percent drop compared to summer 2019 (1.4 billion pounds). This declining spending is also visible in the other top five leagues, except for the German Bundesliga. The declining spending indicates that clubs either buy fewer players or buy less expensive players. It may also indicate that, due to the lack of income, transfer fees have been negotiated to lower levels than in the past. Research by Poli, Besson, and Ravenel (2020) has shown that this was not the case for the summer transfer window of 2020. Since most papers on the determination of transfer fees use data from before the start of the pandemic, little extensive research has been done on this topic yet.

To summarize, several papers on the determination of transfer fees of football players indicate that factors like the number of goals scored and the number of assists given by a player have a positive effect on the transfer fee of that player. In some research, it was found that age has a non-linear and concave effect on the transfer fee, while other papers showed a negative effect of age on the transfer fee. The latter was found when the squared value of age was not included as an explanatory variable. Another possible effect which has been considered in recent research is the remaining contract duration of the player. This effect was found to be positive. A problem which was noted in some research papers is selection bias, which these papers corrected for by using a Heckman correction. In all described papers, linear models were used. A common issue in previous research is the lack of available data, such as information about the contract of a player. Also, often only a small number of leagues over a small number of years are observed. Further, data on transfer fees are likely to contain measurement errors. Altogether, it can be expected that the number of goals scored, the number of assists given, and the remaining contract duration of a player have a positive effect on the transfer fee of the player. The effect of age on the transfer fee is expected to be either negative or non-linear and concave.

### 3 Model and methodology

To investigate the effect of different factors on the transfer fee of a player, a model has been constructed where a log-transformation of the transfer fee is used as the dependent variable and different factors such as age, the number of goals scored per 90 minutes, and the number of assists given per 90 minutes are used as explanatory variables. The setup of this model is covered in Section 3.1. After the construction of the model, multiple linear regression has been used to estimate the effects of the different factors on the transfer fee. To account for sample selection bias, a Heckman correction has been used. This is described in Section 3.2. Since only part of the observations on transfer fees are equal to the true transfer fee while the rest of the observations are expected to contain measurement errors, an analysis of this is added. This is also described in Section 3.2.

### 3.1 Model setup

In the constructed model, the logarithm of the transfer fee of the transferred player has been used as the dependent variable. The logarithm has been used since this is a common transformation in former research. Since the transfer fees are all positive, the logarithm can be used without any loss of the data. As explanatory variables, different factors which are expected to have an effect on the transfer fee have been used. In former research on this topic, it was found that amongst others the number of goals scored and the number of assists given by a player, the age of the player and the remaining contract duration of this player have an effect on the transfer fee. Therefore, these factors have been used as explanatory variables. To capture a possible non-linear relationship between age and the transfer fee, the squared value of the age has also been included.
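To make the role of the squared age term concrete, the following worked equation is a sketch under assumed notation: $\beta_1$ and $\beta_2$ are hypothetical names for the coefficients on age and squared age, and all other regressors are held fixed.

```latex
\begin{align*}
\log(\text{fee}_i) &= \dots + \beta_1\,\text{age}_i + \beta_2\,\text{age}_i^2 + \dots \\
\frac{\partial \log(\text{fee}_i)}{\partial\,\text{age}_i} &= \beta_1 + 2\beta_2\,\text{age}_i \\
\text{age}^{*} &= -\frac{\beta_1}{2\beta_2} \qquad (\beta_2 < 0)
\end{align*}
```

With $\beta_1 > 0$ and $\beta_2 < 0$ the age profile is concave: the expected log fee rises with age up to $\text{age}^{*}$ and falls afterwards. A negative effect of age over the whole observed range corresponds to the case where $\text{age}^{*}$ lies below the youngest age in the sample.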

Furthermore, multiple dummy variables for the position a player plays in have been added to capture the role the player has on the field. More specifically, these are a dummy variable for being a forward, a dummy variable for being a midfielder, and a dummy variable for being a defender. Being a goalkeeper has been chosen as the reference category. Another dummy variable has been added which indicates whether the player played in a top five league in the season prior to the transfer.

The leagues which are included in the analysis are the ten leagues with the highest UEFA coefficients (UEFA, 2022). The five leagues with the highest UEFA coefficients are the English Premier League, the Spanish Primera Division, the Italian Serie A, the German Bundesliga, and the French Ligue 1, which are therefore treated as the top five leagues. Further, a dummy variable is used to capture a possible impact of the coronavirus pandemic on the transfer fees.

To capture the qualities of a player on the pitch, multiple performance statistics on the season prior to the transfer have been used. More specifically, these are the number of goals scored per 90 minutes, the number of assists given per 90 minutes, the number of interceptions made per 90 minutes, the number of tackles won per 90 minutes, the number of yellow cards received per 90 minutes, and the shot save percentage. The shot save percentage measures the share of shots on goal saved by the goalkeeper and is therefore only observed for goalkeepers. The above statistics cover different aspects of football, so that the main jobs of forwards, midfielders, defenders, and goalkeepers are all represented. Lastly, the number of minutes played has been added as an explanatory variable. This statistic has been divided by 90, so that it represents the number of 90 minutes played. An overview of all described statistics including short descriptions is shown in Table 1.

Table 1: Overview of all used statistics, including descriptions. All performance statistics are measured over the season prior to the transfer.

| Statistic | Description |
|---|---|
| Fee | Transfer fee in euros |
| Age | Age at time of transfer |
| Remaining contract | Remaining contract duration at time of transfer, in years |
| 90s played | Number of minutes played divided by 90 |
| Before COVID-19 | 1 if transfer occurred before the coronavirus pandemic, 0 otherwise |
| Top 5 league player | 1 if player plays in a top 5 league, 0 otherwise |
| Forward | 1 if player is a forward, 0 otherwise |
| Midfielder | 1 if player is a midfielder, 0 otherwise |
| Defender | 1 if player is a defender, 0 otherwise |
| Goalkeeper | 1 if player is a goalkeeper, 0 otherwise |
| Goals per 90 minutes | Number of goals scored per 90 minutes played |
| Assists per 90 minutes | Number of assists given per 90 minutes played |
| Interceptions per 90 minutes | Number of interceptions made per 90 minutes played |
| Tackles won per 90 minutes | Number of tackles won per 90 minutes played |
| Yellow cards per 90 minutes | Number of yellow cards received per 90 minutes played |
| Save percentage | Percentage of shots on goal saved; only observed for goalkeepers |
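The per-90 statistics can be illustrated with a short sketch. It assumes that each per-90 statistic is the season total divided by the number of 90s played (minutes played divided by 90); the function name `per_90` and the example player are hypothetical and not taken from the data.

```python
def per_90(total: float, minutes_played: float) -> float:
    """Return a season total expressed per 90 minutes played."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    nineties = minutes_played / 90  # the "90s played" statistic
    return total / nineties


# Hypothetical player: 12 goals in 2,160 minutes, i.e. 24 full matches' worth.
goals_per_90 = per_90(12, 2160)
print(goals_per_90)  # 0.5
```

Normalising by playing time makes players with different numbers of minutes comparable, which is why the number of 90s played enters the model as a separate variable.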

In addition to the already described explanatory variables, some interaction terms have been investigated to incorporate effects which differ per position. The interaction effects which have shown a significant effect are the number of assists per 90 minutes times the dummy for forwards, the number of yellow cards received per 90 minutes times the dummy for midfielders, and the number of tackles won per 90 minutes times the dummy for defenders. No other interaction terms have been used, since adding other interaction terms did not lead to significant results.

### 3.2 Estimation

After the construction of the model, multiple linear regression has been performed to obtain the effects of the different factors on the transfer fee. This has not only been done on the full data set, but also only on forwards, only on midfielders, only on defenders, and only on goalkeepers. For the regressions on a specific position, the dummy variables for the positions and the interaction terms have been removed.

Since sample selection bias is present in the regressions, as concluded by Carmichael, Forrest, and Simmons (1999), a Heckman correction is performed to account for this. The Heckman correction model treats the sample selection process as a form of omitted variable bias (Heckman, 1979). For this model, a selection equation has been added, which is used for a probit regression to capture the probability that a player will be transferred. Since the sample selection bias occurred because only the transfer fees of transferred players can be observed, the equation has a binary dependent variable which is 1 when the player made a transfer and 0 when the player did not make a transfer. Therefore, the probit regression estimates the probability that a player will be transferred.
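In generic notation, the selection model described above can be sketched as follows. The symbols $s_i$, $z_i$, $\theta$, $u_i$, $\rho$, $\sigma_\varepsilon$ and $\lambda$ are assumptions introduced for illustration, and the error terms are assumed to be jointly normal as in Heckman (1979).

```latex
\begin{align*}
s_i &= \mathbf{1}\{ z_i'\theta + u_i > 0 \}, \qquad u_i \sim N(0, 1) \\
y_i &= x_i'\beta + \varepsilon_i \quad \text{observed only if } s_i = 1 \\
\mathrm{E}[\, y_i \mid x_i, z_i, s_i = 1 \,] &= x_i'\beta + \rho\,\sigma_\varepsilon\,\lambda(z_i'\theta),
\qquad \lambda(c) = \frac{\phi(c)}{\Phi(c)}
\end{align*}
```

Here $s_i$ indicates whether player $i$ was transferred, $\rho$ is the correlation between $u_i$ and $\varepsilon_i$, and $\phi$ and $\Phi$ are the standard normal density and distribution function. The term $\rho\,\sigma_\varepsilon\,\lambda(z_i'\theta)$ is the omitted variable: a regression on transferred players alone leaves it in the error term, which biases the estimates when $\rho \neq 0$.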

As shown by Ruijg and Van Ophem (2015), age and the number of matches played are variables which have an effect on the probability that a player will be transferred. Therefore, these variables are added as explanatory variables in the selection equation. The number of matches played is measured over the season prior to the transfer and captures the number of matches a player featured in. This is thus different from the number of 90 minutes played, because a player can also play in a game for less than 90 minutes. The remaining contract duration of the player has also been added as an explanatory variable in this regression, since clubs often want to sell players before their contract expires. The model has been estimated by using the two-step method. This method produces results similar to those of the maximum likelihood method (Heckman, 1979).
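In the two-step method, the first step fits the probit selection equation, and the second step adds the inverse Mills ratio, evaluated at each player's fitted probit index, as an extra regressor in the transfer fee equation. The sketch below only illustrates how that extra regressor is computed; the coefficient values in `theta_hat` and the example player are hypothetical and are not estimates from this thesis.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def probit_index(regressors: list[float], coefficients: list[float]) -> float:
    """Linear index z'theta; the first regressor is the constant 1.0."""
    return sum(z * t for z, t in zip(regressors, coefficients))


def inverse_mills_ratio(index: float) -> float:
    """Return phi(index) / Phi(index) for a fitted probit index."""
    return _STD_NORMAL.pdf(index) / _STD_NORMAL.cdf(index)


# Hypothetical first-step estimates: constant, age, matches, remaining contract.
theta_hat = [0.5, -0.08, 0.02, -0.10]
player = [1.0, 25.0, 30.0, 2.0]  # constant, age 25, 30 matches, 2 years left

index = probit_index(player, theta_hat)
transfer_probability = _STD_NORMAL.cdf(index)  # fitted probability of a transfer
mills = inverse_mills_ratio(index)             # extra regressor for step two
print(round(transfer_probability, 3), round(mills, 3))
```

The ratio is large for players with a low fitted transfer probability and shrinks towards zero as that probability approaches one, so the correction matters most for transfers that the selection equation considers unlikely.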

It is expected that not all of the observed transfer fees are fully accurate, because clubs often do not disclose the exact transfer fees. Therefore, some of the observations of the dependent variable suffer from measurement errors. Since the measurement errors occur in the dependent variable, the estimates remain consistent, provided that the errors are uncorrelated with the explanatory variables. However, the variance of the estimates will be higher than if the exact transfer fees were observed. The clubs that do disclose transfer fees are publicly traded clubs. These clubs are obligated to disclose the exact transfer fees since they are listed on the stock market, and therefore these transfer fees can be treated as more reliable. The publicly traded clubs which are used in this analysis are Manchester United, Borussia Dortmund, Juventus, AS Roma, Lazio Roma, Olympique Lyon, SL Benfica, FC Porto, Sporting CP and AFC Ajax. Since the measurement errors are expected to be smaller for the observations which correspond to these clubs, the basic linear regression and the regression with Heckman correction have additionally been performed on only the data where the transfer fee has been disclosed by a publicly traded football club.
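The effect of measurement error in the dependent variable can be illustrated with a small simulation. This is a sketch under stated assumptions: the error is additive, has mean zero and is independent of the regressor, and all numbers (sample size, coefficients, error variances) are hypothetical.

```python
import random
from statistics import fmean


def ols_slope_and_se(x, y):
    """Simple-regression slope and its classical standard error."""
    n = len(x)
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = (rss / (n - 2) / sxx) ** 0.5
    return slope, se


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
y_exact = [1.0 + 2.0 * xi + rng.gauss(0, 1) for xi in x]  # exactly observed
y_noisy = [yi + rng.gauss(0, 2) for yi in y_exact]        # reported with error

slope_exact, se_exact = ols_slope_and_se(x, y_exact)
slope_noisy, se_noisy = ols_slope_and_se(x, y_noisy)
print(slope_exact, se_exact)
print(slope_noisy, se_noisy)
```

Both slope estimates should be close to the true value of 2, while the standard error based on the noisy observations is larger, because the measurement error is absorbed into the regression error term.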

To investigate the difference between the results obtained when only using fully accurate observations on transfer fees compared to only using observations that suffer from measurement errors, an additional regression is performed. The regression equation is as follows:

$$y_i = \alpha + x_i'\beta + \gamma g_i + (x_i g_i)'\delta + \varepsilon_i \tag{1}$$

where $y_i$ is the $i$th observation of the dependent variable, which is the logarithm of the transfer fee, like in the other models. $x_i$ is a vector which contains the $i$th observations of the explanatory variables, where the same explanatory variables are used as in the regressions on all players. Further, $g_i$ is a dummy variable which is equal to 1 if the $i$th observation of the transfer fee is fully accurate and 0 otherwise. The error term is denoted by $\varepsilon_i$. If the $i$th observation of the transfer fee suffers from a measurement error, so that the exact transfer fee is not disclosed by the involved clubs, $g_i$ is equal to 0 and therefore the regression equation becomes

$$y_i = \alpha + x_i'\beta + \varepsilon_i \tag{2}$$

so that the vector of the coefficients of the explanatory variables is $\beta$. When the $i$th observation of the transfer fee does not suffer from a measurement error, so that the true transfer fee is disclosed by the involved clubs, $g_i$ is equal to 1 and therefore the regression equation becomes

$$y_i = (\alpha + \gamma) + x_i'(\beta + \delta) + \varepsilon_i \tag{3}$$

so that the vector of coefficients of the explanatory variables is $\beta + \delta$. Therefore, if there were no differences between the coefficients of the explanatory variables in (2) and (3), $\delta$ would be a vector of zeros. It is therefore tested whether the entries of this vector are significantly different from zero.

Since the data is divided into observations that suffer from measurement errors and observations that do not, it is expected that the variance of the error term $\varepsilon_i$ is not equal in (2) and (3). This would mean that heteroskedasticity is present when performing the regression on (1). Therefore, White standard errors are used, since these standard errors are heteroskedasticity-consistent.
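A test of $\delta = 0$ in regression (1) with White standard errors can be sketched as follows. This is a minimal illustration on simulated data, not the implementation used in this thesis: the function name `wald_test_interactions`, the HC0 variant of the White estimator, the joint Wald form of the test and all simulated values are assumptions.

```python
import numpy as np


def wald_test_interactions(y, x, g):
    """OLS of y on [1, x, g, x*g] with a White (HC0) covariance matrix.

    Returns the estimate of delta and the Wald statistic for H0: delta = 0.
    """
    n, k = x.shape
    design = np.column_stack([np.ones(n), x, g, x * g[:, None]])
    xtx_inv = np.linalg.inv(design.T @ design)
    coef = xtx_inv @ design.T @ y
    resid = y - design @ coef
    # White sandwich estimator: (X'X)^-1 X' diag(e^2) X (X'X)^-1
    meat = (design * resid[:, None] ** 2).T @ design
    cov = xtx_inv @ meat @ xtx_inv
    delta = coef[-k:]          # coefficients on the interaction terms
    cov_delta = cov[-k:, -k:]
    wald = float(delta @ np.linalg.solve(cov_delta, delta))
    return delta, wald


rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 1))                  # one explanatory variable
g = (rng.random(n) < 0.3).astype(float)      # 1 = exactly disclosed fee
sigma = np.where(g == 1, 0.5, 1.0)           # smaller error variance if g = 1
y = 1.0 + 2.0 * x[:, 0] + 0.5 * g + 1.0 * g * x[:, 0] + sigma * rng.normal(size=n)

delta_hat, wald_stat = wald_test_interactions(y, x, g)
print(delta_hat, wald_stat)
```

Under the null hypothesis the Wald statistic is asymptotically chi-squared distributed with as many degrees of freedom as there are entries in $\delta$; with a single interaction term the 5 percent critical value is 3.84. In this simulation the true $\delta$ equals 1, so the statistic should be far above that value.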

### 4 Data

In this research, observations from ten football leagues over a period of five years have been used. These years span the transfers for the seasons 2017-2018 through 2021-2022. Since performance statistics are observed over the season prior to the transfer, these statistics are observed over the seasons 2016-2017 through 2020-2021. The leagues that are included in the analysis are the ten leagues with the highest UEFA coefficient in 2022, namely the English Premier League, the Spanish La Liga, the Italian Serie A, the German Bundesliga, the French Ligue 1, the Portuguese Liga NOS, the Dutch Eredivisie, the Austrian Bundesliga, the Scottish Premiership and the Russian Premjer Liga.

The research has used data from several sources. First of all, Transfermarkt.com has been used to collect data related to the transfer fees of the transferred players. Since Transfermarkt.com does not contain information about the remaining contract duration of players at the moment of the transfer for every year, another source has been consulted to obtain this information. The computer game FIFA from EA Sports contains a game mode where it is possible to simulate a football career and therefore also to simulate future transfers, with a starting point based on the real world. The remaining contract duration of players at the start of this simulation is therefore based on the real remaining contract duration of a player at that time. Since EA Sports releases a FIFA game every year after the summer transfer window has closed, the data on remaining contract duration have been obtained for different years. These data are summarized on FUTWIZ.com, which is therefore used to collect the data for all relevant years. Since EA Sports FIFA does not have licenses for all clubs in the Austrian Bundesliga, the Scottish Premiership, and the Russian Premjer Liga, the remaining contract duration has not been observed for all clubs in these leagues. For all other leagues, the remaining contract duration has been observed for almost all players.

To capture player characteristics such as age and position, and performance statistics such as the number of goals scored and the number of assists given, FBref.com has been used. This site launched in June 2018 with league coverage for six nations, namely England, France, Spain, Italy, Germany, and the USA. Its database has been expanded over the years and now covers the leagues which are used in this research and more. Detailed information about players for different football seasons has been obtained from this site.

As explained earlier, the basic multiple linear regression model is likely to suffer from sample selection bias. Therefore, the sub-sample which includes the transferred players and the full sample which includes all players have both been used in the analysis. The full sample contains 11,980 observations, while the sub-sample contains 1,523 observations. An overview of all used variables including descriptive statistics is shown in Table 2.

Table 2: Overview of the mean and standard deviation of all used statistics for the sub-sample, which only includes the transferred players, and for the full sample.

| Statistic | Sub-sample mean | Sub-sample st. dev. | Full sample mean | Full sample st. dev. |
|---|---|---|---|---|
| Fee | 10.811 | 16.126 | | |
| Age | 25.495 | 3.392 | 26.795 | 4.054 |
| Remaining contract | 1.993 | 1.097 | 1.982 | 1.254 |
| 90s played | 20.048 | 9.181 | 18.225 | 9.405 |
| Matches featured | 24.501 | 8.45 | 22.801 | 8.914 |
| Before COVID-19 | 0.656 | 0.475 | 0.555 | 0.497 |
| Top 5 league player | 0.772 | 0.420 | 0.727 | 0.445 |
| Forward | 0.271 | 0.444 | 0.219 | 0.414 |
| Midfielder | 0.354 | 0.478 | 0.348 | 0.477 |
| Defender | 0.323 | 0.468 | 0.359 | 0.480 |
| Goalkeeper | 0.053 | 0.223 | 0.073 | 0.260 |
| Goals per 90 minutes | 0.161 | 0.196 | 0.125 | 0.179 |
| Assists per 90 minutes | 0.101 | 0.111 | 0.087 | 0.112 |
| Interceptions per 90 minutes | 0.918 | 0.748 | 0.945 | 0.761 |
| Tackles won per 90 minutes | 0.984 | 0.664 | 0.964 | 0.669 |
| Yellow cards per 90 minutes | 0.188 | 0.147 | 0.188 | 0.153 |
| Save percentage | 70.905 | 6.637 | 69.499 | 7.893 |
| Number of observations | 1,523 | | 11,980 | |

Table 2 shows that the full sample contains 11,980 observations. The sub-sample of transferred players contains 1,523 observations, which is 12.71 percent of the full sample. Furthermore, the transferred players are on average more than one year younger than an average player from the full sample. Also, the playing time, the remaining contract length, the number of goals scored per 90 minutes, the number of assists given per 90 minutes, the number of tackles won per 90 minutes, and the save percentage are on average higher in the sub-sample than in the full sample. Only the number of interceptions made is on average higher in the full sample than in the sub-sample. The differences are, however, not large. Further, a larger share of the observations in the sub-sample concerns the period before the start of the coronavirus pandemic (65.6 percent against 55.5 percent in the full sample). The same holds for the share of players who played in a top five league in the season prior to the transfer: 77.2 percent in the sub-sample against 72.7 percent in the full sample.

Since five of the ten leagues used in the analysis are top five leagues, this percentage was expected to be around 50 percent. The percentages turned out to be higher because of missing contract information for players from the Scottish Premiership, the Austrian Bundesliga, and the Russian Premjer Liga. Lastly, according to Table 2, the sub-sample contains relatively more forwards and midfielders than the full sample. The average fee observed per league and the number of transfers per league are visualized in Figure 1a and Figure 1b, respectively.

Figure 1: Data visualization per league. (a) Average fee per league. (b) Number of observations per league.

Figure 1a shows that the average fee per league is highest for the English Premier League, followed by the other top five leagues. This indicates that players who played in a top five league in the season prior to the transfer have on average a higher transfer fee than players who did not play in these leagues. Figure 1b shows the lack of observations from the Austrian Bundesliga, the Scottish Premiership and the Russian Premjer Liga. This is due to a lack of information about the contracts of players from these leagues, as explained earlier. Information about the remaining contract duration of a player is expected to be important. The average transfer fee per number of years a player has left on his contract is shown in Figure 2a. The distribution of the transfer fees is visualized in Figure 2b.

Figure 2: Data visualization for transfer fees. (a) Average fee per remaining contract duration. (b) Distribution of transfer fees.

Figure 2a indicates that the transfer fee is on average higher for players with more years left on their contract and lower for players with fewer years left. This underlines the importance of information about the remaining contract duration of players. Figure 2b shows that most transfers in the sample correspond to a transfer fee between 0 and 10 million euros. As the transfer fee increases, the number of observations decreases.

### 5 Results

The results of the basic multiple linear regression and the results of the regression with Heckman correction are summarized in Table 3. Table 3a presents the main regression output, while Table 3b shows the results for the selection equation. Since the dependent variable is the logarithm of the transfer fee, all coefficients can be transformed to obtain percentage changes: a coefficient β corresponds to a change of (e^{β} − 1) · 100 percent. For instance, looking at the results from the model with Heckman correction, an additional goal per 90 minutes makes the transfer fee e^{1.725} = 5.61 times as large. This is equal to an increase of 461 percent. For dummy variables, the interpretation is slightly different. In the model with Heckman correction, being a top five league player has a coefficient of 1.334, which means that the transfer fee is e^{1.334} = 3.80 times as large as the fee of a similar player who does not play in a top five league. The interpretation of the coefficients of interaction terms is again slightly different, because the main effect and the interaction term have to be added. For instance, in the model with Heckman correction, every additional assist given per 90 minutes increases the fee by (e^{1.351+1.509} − 1) · 100 = 1646.2 percent for forwards, while this is (e^{1.351} − 1) · 100 = 286.1 percent for all other positions.
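The transformations above can be checked numerically. The following is a minimal Python sketch, not the code used for this thesis: the function and constant names are illustrative assumptions, and the coefficients are the Heckman estimates quoted in the text.

```python
import math


def fee_multiplier(*coefficients):
    """Factor by which the fee changes for a one-unit increase in a regressor.

    Several coefficients can be passed when they apply jointly,
    for example a main effect plus an interaction term.
    """
    return math.exp(sum(coefficients))


def fee_pct_change(*coefficients):
    """Percentage change in the fee implied by the same coefficients."""
    return 100.0 * (fee_multiplier(*coefficients) - 1.0)


# Heckman estimates quoted in the text (dependent variable: log(Fee)).
GOALS_PER_90 = 1.725
TOP5_LEAGUE = 1.334
ASSISTS_PER_90 = 1.351
FW_X_ASSISTS = 1.509  # interaction term, applies to forwards only

print(f"goal per 90:            x{fee_multiplier(GOALS_PER_90):.2f}")
print(f"top five league:        x{fee_multiplier(TOP5_LEAGUE):.2f}")
print(f"assist, non-forwards:   {fee_pct_change(ASSISTS_PER_90):.1f}%")
print(f"assist, forwards:       {fee_pct_change(ASSISTS_PER_90, FW_X_ASSISTS):.1f}%")
```

Passing the main effect and the interaction term together mirrors the way the two coefficients are summed in the exponent for forwards.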

Table 3a shows similar results for the basic linear regression as for the regression with Heckman correction. The table shows that the remaining contract duration of a player has a positive effect on the transfer fee of the player. More specifically, an additional year on a player’s contract increases the transfer fee by 33.9 percent (48.6 percent without Heckman correction). Playing in a top five league in the season prior to the transfer also has a positive effect on the transfer fee: being a top five league player increases the transfer fee by 279.6 percent (271.4 percent without correction). Playing an extra 90 minutes increases the transfer fee by 5.0 percent (3.6 percent without correction). The number of goals scored and the number of assists given per 90 minutes also have a positive effect. Every additional goal scored per 90 minutes increases the fee by 461.3 percent (same result without correction), while for every additional assist given per 90 minutes this is 286.1 percent (259.3 percent without correction).

Table 3a: OLS and Heckman two-step estimation results. The signs ’∗’, ’∗∗’, and ’∗∗∗’ indicate significance of the coefficients at the 5%, 1%, and 0.1% level, respectively. Dependent variable: log(Fee).

| Variable | OLS estimate | OLS st. error | Heckman estimate | Heckman st. error |
|---|---|---|---|---|
| Age | -0.211^{∗} | (0.094) | 0.021 | (0.160) |
| Age^{2} | 0.002 | (0.002) | -0.004 | (0.004) |
| Remaining contract | 0.396^{∗∗∗} | (0.026) | 0.292^{∗∗∗} | (0.060) |
| Top 5 league player | 1.312^{∗∗∗} | (0.069) | 1.334^{∗∗∗} | (0.070) |
| Before COVID-19 | -0.019 | (0.057) | -0.028 | (0.057) |
| 90s played | 0.034^{∗∗∗} | (0.003) | 0.049^{∗∗∗} | (0.008) |
| Forward | 4.284^{∗∗∗} | (1.260) | 4.435^{∗∗∗} | (1.213) |
| Midfielder | 4.315^{∗∗∗} | (1.261) | 4.428^{∗∗∗} | (1.214) |
| Defender | 4.999^{∗∗∗} | (1.261) | 5.065^{∗∗∗} | (1.212) |
| Goals per 90 min | 1.725^{∗∗∗} | (0.187) | 1.725^{∗∗∗} | (0.185) |
| Assists per 90 min | 1.279^{∗∗∗} | (0.321) | 1.351^{∗∗∗} | (0.319) |
| Interceptions per 90 min | -0.024 | (0.052) | -0.022 | (0.052) |
| Tackles won per 90 min | 0.086 | (0.069) | 0.090 | (0.069) |
| Yellow cards per 90 min | -0.832^{∗∗} | (0.257) | -0.805^{∗∗∗} | (0.254) |
| Save percentage | 0.069^{∗∗∗} | (0.018) | 0.070^{∗∗∗} | (0.017) |
| FW*Assists per 90 min | 1.617^{∗∗} | (0.557) | 1.509^{∗∗} | (0.556) |
| DF*Tackles won per 90 min | -0.249^{∗} | (0.103) | -0.254^{∗} | (0.102) |
| MF*Yellow cards per 90 min | 1.300^{∗∗∗} | (0.382) | 1.270^{∗∗∗} | (0.377) |
| Constant | -1.798 | (1.730) | -6.476^{∗} | (3.014) |
| R^{2} | 0.452 | | | |
| Correlation | | | 0.865 | |
| Observations | 1,523 | | 11,980 | |
| Residual Std. Error | 1.028 | | 1.027 | |
| p-value F-test | 0.000 | | 0.000 | |

Table 3b: Selection equation results. The signs ’∗’, ’∗∗’, and ’∗∗∗’ indicate significance of the coefficients at the 5%, 1%, and 0.1% level, respectively. Dependent variable: Transfer.

| Variable | Estimate | St. error |
|---|---|---|
| Age | 0.199^{∗∗∗} | (0.047) |
| Age^{2} | -0.005^{∗∗∗} | (0.001) |
| Remaining contract | -0.090^{∗∗∗} | (0.013) |
| Matches featured | 0.015^{∗∗∗} | (0.002) |
| Constant | -3.185^{∗∗∗} | (0.643) |

A negative effect is found for every extra yellow card received per 90 minutes, namely a decrease of the transfer fee of 55.3 percent (56.4 percent without correction). For goalkeepers, when the save percentage increases by one percentage point, the transfer fee increases by 7.3 percent (7.1 percent without correction). No significant effects are found for the number of interceptions made per 90 minutes and the number of tackles won per 90 minutes. This also holds for the coronavirus pandemic. The results therefore suggest that, for similar players, transfer fees have not been negotiated to lower or higher levels since the pandemic than before it. Further, the position of a player is found to have a significant effect on the transfer fee. Being a forward, being a midfielder and being a defender are all found to have a positive effect on the fee compared to being a goalkeeper. This effect is highest for defenders (a fee that is e^{5.065} = 158.4 times as large).

Significant effects are found for the interaction terms. Table 3a shows a positive additional effect of the number of assists given per 90 minutes for forwards compared to other players. More specifically, an additional assist given per 90 minutes increases the fee by (e^{1.351+1.509} − 1) · 100 = 1646.2 percent for forwards, while this is (e^{1.351} − 1) · 100 = 286.1 percent for all other positions. While an additional yellow card received per 90 minutes decreases the transfer fee by 55.3 percent (56.4 percent without correction), the effect is different for midfielders. For midfielders, combining the main effect and the interaction term gives an increase of the fee by 59.2 percent (59.7 percent without correction). Further, no significant main effect is found for the number of tackles won per 90 minutes. However, for defenders, the effect is found to be negative: a decrease of 22.4 percent for every extra tackle won per 90 minutes (22.0 percent without correction). The only difference in terms of significant effects between the basic linear regression model and the model with Heckman correction is the effect of age. In the basic linear regression model, it is found that age has a negative effect on the transfer fee. More specifically, becoming one year older decreases the transfer fee by 19.0 percent. This effect is found to be linear, since no significant effect is found for the squared value of the age. In the model with Heckman correction, no significant effect of age is found.

The estimated correlation between the error terms of the selection equation and the fee equation in the model with Heckman correction is 0.865, which implies that there is a positive correlation between the probability that a player is transferred and the transfer fee of that player. The results from the probit regression which belongs to the Heckman selection equation are shown in Table 3b. The table shows a non-linear and concave effect of age on the probability that a player will be transferred, implying that the probability is increasing for relatively young players and decreasing for relatively old players. The effect of the remaining contract duration is found to be negative: an additional year on a player’s contract lowers the probit index by 0.090. Featuring in one extra match raises the index by 0.015. Since these are coefficients of the probit index rather than marginal effects, the corresponding change in the transfer probability depends on the values of the other variables.
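To illustrate the concave age profile, the sketch below evaluates the selection equation of Table 3b for a single player. It assumes that the reported estimates are coefficients of the probit index (an assumption, suggested by the presence of a constant); the function names and the example player are hypothetical and not part of the thesis.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, st. dev. 1)

# Table 3b estimates, treated here as probit index coefficients (assumption).
CONST, B_AGE, B_AGE2, B_CONTRACT, B_MATCHES = -3.185, 0.199, -0.005, -0.090, 0.015


def probit_index(age, contract_years, matches):
    """Linear index of the selection equation."""
    return (CONST + B_AGE * age + B_AGE2 * age ** 2
            + B_CONTRACT * contract_years + B_MATCHES * matches)


def transfer_probability(age, contract_years, matches):
    """Implied probability of a transfer: Phi(index)."""
    return STD_NORMAL.cdf(probit_index(age, contract_years, matches))


def contract_marginal_effect(age, contract_years, matches):
    """Change in probability per extra contract year at this point: phi(index) * beta."""
    return STD_NORMAL.pdf(probit_index(age, contract_years, matches)) * B_CONTRACT


def peak_age():
    """Age at which the quadratic age profile of the index is highest."""
    return -B_AGE / (2 * B_AGE2)


# Hypothetical player: 25 years old, 2 contract years left, 25 matches featured.
print(f"index peaks at age:       {peak_age():.1f}")
print(f"transfer probability:     {transfer_probability(25, 2, 25):.3f}")
print(f"effect of +1 contract yr: {contract_marginal_effect(25, 2, 25):.4f}")
```

Because the coefficient of Age^{2} is reported to three decimals only, the implied peak age is a rough indication rather than a precise estimate.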

The results obtained by only using the observations which contain officially published transfer fees are presented in Table 4. Table 4 shows fewer significant effects than Table 3, most likely because the results from Table 4 are obtained by using fewer observations. Table 4a shows that in the model with Heckman correction, being a top five league player makes the transfer fee 2.8 times as large (2.6 without Heckman correction) compared to not being a top five league player. Every additional 90 minutes played increases the fee by 6.2 percent (3.0 percent without correction). Every extra goal scored per 90 minutes increases the fee by 271.7 percent (273.2 percent without correction), while every extra assist given per 90 minutes increases the fee by 472.0 percent (417.1 percent without correction). An additional yellow card received per 90 minutes decreases the fee by 83.2 percent (83.5 percent without correction).

For the basic linear regression model, it is found that an additional year on a player’s contract increases the player’s transfer fee by 30.2 percent. For the model with Heckman correction, no significant effect is found. The results of the model with Heckman correction show significantly positive results for being a forward, being a midfielder, and being a defender compared to being a goalkeeper. These effects are not found to be significant for the basic linear regression model. The same holds for the save percentage. For the model with Heckman correction, it is found that an increase of one percentage point of the save percentage increases the transfer fee of the player by 3.6 percent.

Table 4a: OLS and Heckman two-step estimation results for the regression which only contains publicly traded clubs. The signs ’⋄’, ’∗’, ’∗∗’, and ’∗∗∗’ indicate significance of the coefficients at the 10%, 5%, 1%, and 0.1% level, respectively. Dependent variable: log(Fee).

| Variable | OLS estimate | OLS st. error | Heckman estimate | Heckman st. error |
|---|---|---|---|---|
| Age | -0.109 | (0.228) | 0.698 | (0.753) |
| Age^{2} | 0.001 | (0.005) | -0.019 | (0.017) |
| Remaining contract | 0.264^{∗∗∗} | (0.060) | -0.028 | (0.253) |
| Top 5 league player | 0.973^{∗∗∗} | (0.160) | 1.013^{∗∗∗} | (0.161) |
| Before COVID-19 | -0.268^{⋄} | (0.152) | -0.320^{∗} | (0.150) |
| 90s played | 0.030^{∗∗∗} | (0.008) | 0.060^{∗∗} | (0.021) |
| Forward | 2.299 | (2.444) | 3.030^{∗∗} | (0.951) |
| Midfielder | 2.003 | (2.468) | 2.617^{∗∗} | (0.960) |
| Defender | 2.675 | (2.453) | 3.238^{∗∗∗} | (0.931) |
| Goals per 90 min | 1.317^{∗∗} | (0.472) | 1.313^{∗∗} | (0.438) |
| Assists per 90 min | 1.643^{∗} | (0.708) | 1.744^{∗} | (0.695) |
| Interceptions per 90 min | 0.068 | (0.142) | 0.074 | (0.135) |
| Tackles won per 90 min | -0.002 | (0.177) | 0.019 | (0.174) |
| Yellow cards per 90 min | -1.802^{∗∗} | (0.667) | -1.781^{∗∗} | (0.606) |
| Save percentage | 0.028 | (0.034) | 0.035^{∗∗} | (0.014) |
| FW*Assists per 90 min | 0.618 | (1.354) | 0.370 | (1.304) |
| DF*Tackles won per 90 min | -0.182 | (0.238) | -0.207 | (0.238) |
| MF*Yellow cards per 90 min | 1.746^{⋄} | (0.993) | 1.809^{⋄} | (0.941) |
| Constant | 0.393 | (3.466) | -13.442 | (11.873) |
| R^{2} | 0.369 | | | |
| Correlation | | | 1.113 | |
| Observations | 245 | | 977 | |
| Residual Std. Error | 1.000 | | 0.993 | |
| p-value F-test | 0.000 | | 0.000 | |

Table 4b: Selection equation results for the regression which only contains publicly traded clubs. The signs ’∗’, ’∗∗’, and ’∗∗∗’ indicate significance of the coefficients at the 5%, 1%, and 0.1% level, respectively. Dependent variable: Transfer.

| Variable | Estimate | St. error |
|---|---|---|
| Age | 0.258^{∗} | (0.123) |
| Age^{2} | -0.006^{∗∗} | (0.002) |
| Remaining contract | -0.100^{∗} | (0.039) |
| Matches featured | 0.012^{∗} | (0.005) |
| Constant | -3.350^{∗} | (1.647) |

In both models, a negative coefficient is found for the Before COVID-19 dummy (significant at the 10% level without Heckman correction and at the 5% level with correction). Since this dummy variable is 1 when the transfer took place before the start of the pandemic, the negative sign means that, for similar players, transfer fees were lower before the pandemic than since the pandemic. For midfielders, adding the main effect and the interaction term in Table 4a shows that an additional yellow card received per 90 minutes increases the transfer fee by 2.8 percent in the model with Heckman correction (e^{-1.781+1.809} − 1), while the combined effect is negative in the basic linear regression model (-5.4 percent, from e^{-1.802+1.746} − 1). Further, the results from both regressions do not show a significant effect of age. The R^{2} of the basic linear regression is 0.369, which is lower than the R^{2} of 0.452 found when using the full data set. The estimated correlation coefficient is 1.113, which is unexpected. Table 4b shows the results from the probit regression which belongs to the selection equation of the model with Heckman correction. As in Table 3b, the effect of age is found to be non-linear and concave. Further, an additional year on a player’s contract lowers the probit index by 0.100, and featuring in one extra match raises it by 0.012. All effects are less significant than the effects found in Table 3b.

A basic linear regression and a regression with Heckman correction are also performed separately on only forwards, only midfielders, only defenders, and only goalkeepers. The full data set has been used for this, so not only the observations concerning publicly traded football clubs. The results of the regressions performed only on forwards are summarized in Table 5. Table 5a presents the main regression output, while Table 5b shows the results for the selection equation. The results for only midfielders, only defenders, and only goalkeepers are summarized in a similar way in Table 6, Table 7, and Table 8, respectively. These tables are, together with Table 5, added to the thesis as an appendix.

Being a top five league player has a positive effect on the transfer fee for all positions. The number of goals scored per 90 minutes has a positive effect on the transfer fee for all outfield players (forwards, midfielders, and defenders). The same holds for the number of assists given per 90 minutes. However, for defenders the number of assists per 90 minutes only shows a significant effect when using Heckman correction and not without the correction. Significantly positive effects are also found for the number of 90 minutes played for all outfield players. For goalkeepers, a positive effect of the number of 90 minutes played is found in the basic linear regression model, while no significant effect is found with Heckman correction.

The number of interceptions made per 90 minutes shows a significantly positive effect for forwards when using Heckman correction. In all other cases, no significant effect is found. Neither the age of a player nor the coronavirus pandemic shows a significant effect in any of the models for any position. The remaining contract duration shows a positive effect on the transfer fee for all positions in the basic linear regression model. This effect is also found for midfielders when using Heckman correction, while no significant effects are found with the correction for the other positions. The number of yellow cards received per 90 minutes shows a negative effect for forwards and defenders, while no significant effects are found for midfielders and goalkeepers. For goalkeepers, the save percentage has a positive effect on the transfer fee.

For forwards, midfielders, and goalkeepers, the age of a player has a non-linear and concave effect on the probability of making a transfer. For defenders, no significant effect of the age of the player is found. For all positions, the effect of the remaining contract duration is negative. The effect of the number of matches a player featured in is found to be positive for all positions, except for goalkeepers. The number of saves per 90 minutes also does not show a significant effect for goalkeepers. The R^{2} is highest for the regression for forwards (0.502) and lowest for the regression for midfielders (0.410). The correlation coefficient is 0.954 in the regression for forwards and 0.316 in the regression for midfielders. The correlation is higher than 1 in the regression for defenders and in the regression for goalkeepers, which is unexpected.

In all regressions presented, the results indicate that playing in a top five league and the number of 90 minutes played have a positive effect on the transfer fee, as well as the number of goals scored per 90 minutes and often also the number of assists given per 90 minutes. The only negative effect that is found in almost all models is that of the number of yellow cards received per 90 minutes. The regression using only the observations from publicly traded clubs shows fewer significant results than the regression using all observations. Also, this regression has the lowest R^{2} of all performed regressions (0.369). The highest R^{2} is found for the regression on forwards (0.502). Further, the correlation between the probability that a player will be transferred and the transfer fee is found to be positive. However, in some models, the estimated correlation coefficient is higher than 1, which is unexpected.

The effect of age on the probability that a player will be transferred is found to be non-linear and concave, while the effect of the remaining contract duration is found to be negative. The effect of the number of matches a player featured in, by contrast, is found to be positive.

The results for the regression which tests for differences between the results when only using officially published transfer fees and only using not officially published transfer fees show some significant results. More specifically, a difference is found for being a top five league player (-0.517 without Heckman correction and -0.523 with correction, 1% significance level). A significant difference is also found for the effect of the coronavirus pandemic (-0.269 without correction and -0.290 with correction) and the number of yellow cards per 90 minutes (-1.153 without correction and -1.173 with correction), both at the 10% significance level. For all other variables, no differences which are significantly different from zero are found.

### 6 Conclusion and discussion

In the regressions presented, it is often found that playing in a top five league, the number of 90 minutes played, the number of goals scored per 90 minutes, and the number of assists given per 90 minutes in the season prior to the transfer all have a positive effect on the transfer fee. The effect of the number of yellow cards received per 90 minutes is often found to be negative. These findings are in line with the literature. The remaining contract duration shows a positive effect on the transfer fee in most regressions. This finding is also in line with the literature. The used interaction terms show less expected effects. The positive effect of the number of yellow cards received per 90 minutes for midfielders and the negative effect of the number of tackles won per 90 minutes for defenders are not as expected. These effects are only found when using all the data. The last interaction effect which was found is the positive effect of the number of assists per 90 minutes for attackers, which is a more logical result. Position seems to have an effect on the transfer fee, since the included dummy variables show significant results. Being an outfield player is found to have a positive effect on the transfer fee compared to being a goalkeeper. The effects per position differ, but also have some similarities. For instance, being a top five league player and the number of goals scored per 90 minutes show a positive effect on the transfer fee for all positions.

When the full data set is used, the results from the basic linear regression model and the model with Heckman correction do not differ a lot. The only difference is that in the basic linear regression model, age is found to have a significantly negative and linear effect, while in the model with Heckman correction, no significant effect is found. When using only the data concerning publicly traded clubs, fewer variables are found to have a significant effect. Age is for instance not found to have a significant effect, which is not as expected. However, as opposed to the regressions using the full data set, a significant coefficient is found for the Before COVID-19 dummy. This coefficient is negative. Since the dummy equals 1 for transfers before the pandemic, this implies that fees for comparable players were lower before the pandemic than since the pandemic. Together with the fact that no significant effect was found using the full data set, this means that there is no statistical evidence that transfer fees have been negotiated to lower levels for the same player since the pandemic compared to before the pandemic, which is in line with the literature. Since there are only ten publicly traded clubs which are used in the analysis, this data set is substantially smaller than the data set containing all clubs. This is most likely the reason that fewer significant effects have been found. However, the data set still has the advantage that the measurement errors are expected to be smaller than the errors from the data set which contains all clubs.

The positive correlation between the probability that a player is transferred and the transfer fee is found in all models and is as expected and in line with the literature. However, in some models, the estimated correlation coefficient is higher than 1. This is unexpected, since a correlation coefficient takes values between -1 and 1. Therefore, the specification of the models may not be fully accurate. Further, age is found to have a non-linear and concave effect on the probability that a player is transferred. The effect of the remaining contract duration is found to be negative, while the effect of the number of matches played is found to be positive. All of these effects are as expected.
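One possible mechanical reason for an estimated correlation above 1 is that a two-step procedure does not estimate the correlation directly, but recovers it as the ratio of the coefficient on the inverse Mills ratio to an estimated standard deviation, and nothing bounds that ratio. The sketch below follows the textbook two-step formulas (Heckman, 1979). It is an illustration under stated assumptions, not the estimation code of this thesis: all function names and all input numbers are hypothetical.

```python
import math
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()


def inverse_mills(z):
    """Inverse Mills ratio phi(z) / Phi(z) for a selected observation."""
    return STD_NORMAL.pdf(z) / STD_NORMAL.cdf(z)


def two_step_rho(beta_lambda, residuals, probit_indices):
    """Correlation implied by the two-step estimates.

    beta_lambda:    coefficient on the inverse Mills ratio in the fee equation
    residuals:      second-step residuals of the selected observations
    probit_indices: fitted probit index z_i of the same observations
    """
    # delta_i = lambda_i * (lambda_i + z_i) corrects the residual variance.
    deltas = [inverse_mills(z) * (inverse_mills(z) + z) for z in probit_indices]
    sigma_sq = fmean([e * e for e in residuals]) + beta_lambda ** 2 * fmean(deltas)
    return beta_lambda / math.sqrt(sigma_sq)


# Made-up inputs, for illustration only.
residuals = [0.6, -0.4, 0.5, -0.7]
indices = [0.5, 1.0, 0.0, -0.5]

print(two_step_rho(0.3, residuals, indices))  # stays below 1
print(two_step_rho(1.2, residuals, indices))  # exceeds 1
```

With a large coefficient on the inverse Mills ratio relative to the residual variance, the ratio exceeds 1 even though a true correlation cannot, which is why such an estimate is better read as a warning sign than as a correlation.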

The results for the regression which tests for differences between the results when only using officially published transfer fees and only using not officially published transfer fees have shown some significant results. A difference has been found for being a top five league player at the 1% significance level, while significant differences for the effect of the coronavirus pandemic and the number of yellow cards per 90 minutes have both been found at the 10% significance level. For all other variables, no differences which are significantly different from zero have been found. Since only one difference is found to be relatively highly significant, it seems that the differences between the results are small.

In conclusion, the finding that playing in a top five league, the number of 90 minutes played, the number of goals scored per 90 minutes, and the number of assists given per 90 minutes have a positive effect on the transfer fee is confirmed in all models and is as expected. In all models, the effect of the number of yellow cards received per 90 minutes is found to be negative, which is also as expected. The position of a player is found to have an effect on the transfer fee, where being an outfield player has a positive effect on the fee compared to being a goalkeeper. The regressions performed on all positions separately show some minor differences. The most significant effects are found when using the full data set, but this data set has the disadvantage of measurement errors due to the fact that not all clubs disclose the official transfer fee. The regressions which are performed on data from publicly traded clubs have the advantage of smaller measurement errors, but the disadvantage of containing fewer observations. A regression which tested for differences between the results when only using officially published transfer fees and only using not officially published transfer fees showed that the differences are small. This indicates that the transfer fees on Transfermarkt.com are quite reliable.

Although the transfer fees concerning publicly traded clubs have been disclosed and were therefore treated as having no measurement errors, this thesis has not accounted for different structures in transfer fees. For instance, transfer agreements could include performance-related bonuses which are not disclosed. Further, for future research, more observations could be obtained by using more leagues over more years. Also, since all results show R^{2} values between 0.369 and 0.502, different explanatory variables could be investigated to improve the specification of the models. Further, future research could investigate more extensively the determinants of the probability of a player making a transfer. This thesis uses multiple variables, but more possible variables could be explored. Lastly, no free transfers are included in this thesis. Therefore, another form of selection bias may be present. This should be explored in future research.

### References

Carmichael, F., Forrest, D., & Simmons, R. (1999). The labour market in association football: who gets transferred and for how much? Bulletin of Economic Research, 51(2), 125–150.

Deloitte. (2021, Sep). Premier league clubs spend £1.1 billion in summer transfer window but overall spending falls for second consecutive year. Retrieved from https://www2.deloitte.com/uk/en/pages/press-releases/articles/premier-league-clubs-spend-1-1-billion-in-summer.html

Dobson, S., & Gerrard, B. (1999). The determination of player transfer fees in english professional soccer. Journal of Sport Management, 13(4), 259–279.

FBref. (2022). Football statistics and history. Retrieved from https://fbref.com/en/

Financial Times. (2021, May). Europe’s football clubs face ’new reality’ after €9bn covid hit. Financial Times. Retrieved from https://www.ft.com/content/a1ef328e-5e2c-4d8f-9361-cc50f46cdeaa

FUTWIZ. (2022). Fifa career mode. Retrieved from https://www.futwiz.com/en/fifa22/career-mode/players

Heckman, J. J. (1979). Sample selection bias as a specification error. Econometrica, 47(1), 153. doi: 10.2307/1912352

Majewski, S. (2016). Identification of factors determining market value of the most valuable football players. Central European Management Journal, 24(3), 91–104.

Nielsen Sports. (2018). World Football Report. Retrieved from https://www.nielsen.com/wp-content/uploads/sites/3/2019/04/world-football-report-2018.pdf

Poli, R., Besson, R., & Ravenel, L. (2020, Oct). The real impact of covid on the football players’ transfer market. Retrieved from https://football-observatory.com/IMG/pdf/mr58en.pdf

Poli, R., Besson, R., & Ravenel, L. (2021). Econometric approach to assessing the transfer fees and values of professional football players. Economies, 10(1), 4.

Ruijg, J., & Van Ophem, H. (2015). Determinants of football transfers. Applied Economics Letters, 22(1), 12–19.

Transfermarkt. (2022). Football transfers, rumours, market values and news. Retrieved from https://www.transfermarkt.com/

UEFA. (2022). Country coefficients: Uefa coefficients. Retrieved from https://www.uefa.com/nationalassociations/uefarankings/country/#/yr/2022

Table 5a: OLS and Heckman two-step estimation results for the regression which only contains forwards. The signs ‘⋄’, ‘∗’, ‘∗∗’, and ‘∗∗∗’ indicate significance of the coefficients at the 10%, 5%, 1%, and 0.1% level, respectively.

Dependent variable: log(Fee)

| Variable | OLS estimate | OLS st. error | Heckman estimate | Heckman st. error |
| --- | --- | --- | --- | --- |
| Age | -0.231 | (0.177) | 0.169 | (0.315) |
| Age^{2} | 0.003 | (0.004) | -0.007 | (0.007) |
| Remaining contract | 0.393^{∗∗∗} | (0.050) | 0.181 | (0.134) |
| Top 5 league player | 1.305^{∗∗∗} | (0.130) | 1.336^{∗∗∗} | (0.130) |
| Before COVID-19 | 0.056 | (0.103) | 0.075 | (0.102) |
| 90s played | 0.034^{∗∗∗} | (0.006) | 0.053^{∗∗∗} | (0.012) |
| Goals per 90 min | 1.602^{∗∗∗} | (0.244) | 1.629^{∗∗∗} | (0.238) |
| Assists per 90 min | 2.899^{∗∗∗} | (0.445) | 2.870^{∗∗∗} | (0.435) |
| Interceptions per 90 min | 0.296 | (0.180) | 0.350^{∗} | (0.178) |
| Tackles won per 90 min | -0.329^{∗} | (0.142) | -0.324^{∗} | (0.139) |
| Yellow cards per 90 min | -0.774^{∗} | (0.380) | -0.742^{∗} | (0.366) |
| Constant | 2.917 | (2.362) | -3.954 | (4.877) |
| R^{2} | 0.502 | | | |
| Correlation | | | 0.954 | |
| Observations | 412 | | 2,627 | |
| Residual Std. Error | 0.964 | | 0.968 | |
| p-value F-test | 0.000 | | 0.000 | |
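Because the dependent variable in Table 5a is log(Fee), a coefficient b corresponds to a change of (exp(b) − 1) · 100% in the fee for a one-unit increase in the regressor, holding the other regressors fixed. The following is a minimal sketch of that conversion, not part of the original estimation code; the function name `pct_effect` is an assumption made for illustration, and the three coefficients are the OLS estimates for forwards copied from Table 5a.

```python
import math


def pct_effect(coef: float) -> float:
    """Percentage change in the fee implied by a coefficient in a log(Fee) regression."""
    return (math.exp(coef) - 1.0) * 100.0


# OLS estimates for forwards, copied from Table 5a (illustrative selection).
ols_forwards = {
    "Top 5 league player": 1.305,
    "Goals per 90 min": 1.602,
    "Yellow cards per 90 min": -0.774,
}

# Convert each log-point coefficient into a percentage effect on the fee.
effects = {name: pct_effect(b) for name, b in ols_forwards.items()}

for name, effect in effects.items():
    print(f"{name}: {effect:+.1f}%")
```

Note that a one-unit change in a per-90-minutes statistic (for example one additional goal per 90 minutes) is a very large change, so these percentages are better read per small increment than taken at face value.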

Table 5b: Selection equation results for the regression which only contains forwards. The signs ‘⋄’, ‘∗’, ‘∗∗’, and ‘∗∗∗’ indicate significance of the coefficients at the 10%, 5%, 1%, and 0.1% level, respectively.

Dependent variable: Transfer

| Variable | Estimate | St. error |
| --- | --- | --- |
| Age | 0.284^{∗∗} | (0.096) |
| Age^{2} | -0.007^{∗∗∗} | (0.002) |
| Remaining contract | -0.151^{∗∗∗} | (0.027) |
| Matches featured | 0.018^{∗∗∗} | (0.004) |
| Constant | -4.155^{∗∗} | (1.300) |
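To illustrate how the selection equation in Table 5b feeds into the second step of the Heckman procedure, the sketch below computes the selection index, the implied transfer probability, and the inverse Mills ratio for one player. It assumes the selection equation is the usual probit first step of the two-step estimator. The player values (age 25, two units of remaining contract, 30 matches featured) are hypothetical, and the unit of Remaining contract is assumed to be years; neither comes from the data set.

```python
import math


def std_normal_pdf(z: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(z: float) -> float:
    """phi(z) / Phi(z): the extra regressor in the second step of the two-step estimator."""
    return std_normal_pdf(z) / std_normal_cdf(z)


# Selection-equation estimates for forwards, copied from Table 5b.
coef = {
    "const": -4.155,
    "age": 0.284,
    "age_sq": -0.007,
    "remaining_contract": -0.151,
    "matches_featured": 0.018,
}

# Hypothetical player (assumed values, not taken from the data set).
age, remaining_contract, matches_featured = 25, 2, 30

# Linear index of the probit selection equation.
index = (
    coef["const"]
    + coef["age"] * age
    + coef["age_sq"] * age**2
    + coef["remaining_contract"] * remaining_contract
    + coef["matches_featured"] * matches_featured
)

transfer_prob = std_normal_cdf(index)  # implied probability of a transfer
imr = inverse_mills_ratio(index)       # selection-correction term for this player

print(f"index = {index:.3f}, P(transfer) = {transfer_prob:.3f}, IMR = {imr:.3f}")
```

In the two-step estimator this ratio is computed for every transferred player and added as a regressor to the log(Fee) equation; its coefficient captures the correlation between the selection and outcome errors.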

Table 6a: OLS and Heckman two-step estimation results for the regression which only contains midfielders. The signs ‘⋄’, ‘∗’, ‘∗∗’, and ‘∗∗∗’ indicate significance of the coefficients at the 10%, 5%, 1%, and 0.1% level, respectively.

Dependent variable: log(Fee)

| Variable | OLS estimate | OLS st. error | Heckman estimate | Heckman st. error |
| --- | --- | --- | --- | --- |
| Age | -0.219 | (0.183) | -0.123 | (0.330) |
| Age^{2} | 0.002 | (0.004) | 0.000 | (0.007) |
| Remaining contract | 0.382^{∗∗∗} | (0.044) | 0.363^{∗∗∗} | (0.072) |
| Top 5 league player | 1.317^{∗∗∗} | (0.119) | 1.325^{∗∗∗} | (0.119) |
| Before COVID-19 | -0.038 | (0.099) | -0.043 | (0.099) |
| 90s played | 0.038^{∗∗∗} | (0.005) | 0.042^{∗∗} | (0.013) |
| Goals per 90 min | 1.676^{∗∗∗} | (0.330) | 1.681^{∗∗∗} | (0.326) |
| Assists per 90 min | 1.336^{∗∗∗} | (0.400) | 1.350^{∗∗∗} | (0.396) |
| Interceptions per 90 min | -0.059 | (0.086) | -0.058 | (0.085) |
| Tackles won per 90 min | 0.187^{∗} | (0.090) | 0.186^{∗} | (0.089) |
| Yellow cards per 90 min | 0.406 | (0.303) | 0.404 | (0.300) |
| Constant | 2.471 | (2.439) | 0.800 | (5.402) |
| R^{2} | 0.410 | | | |
| Correlation | | | 0.316 | |
| Observations | 539 | | 4,175 | |
| Residual Std. Error | 1.043 | | 1.043 | |
| p-value F-test | 0.000 | | 0.000 | |

Table 6b: Selection equation results for the regression which only contains midfielders. The signs ‘⋄’, ‘∗’, ‘∗∗’, and ‘∗∗∗’ indicate significance of the coefficients at the 10%, 5%, 1%, and 0.1% level, respectively.

Dependent variable: Transfer

| Variable | Estimate | St. error |
| --- | --- | --- |
| Age | 0.365^{∗∗∗} | (0.089) |
| Age^{2} | -0.008^{∗∗∗} | (0.002) |
| Remaining contract | -0.070^{∗∗} | (0.023) |
| Matches featured | 0.017^{∗∗∗} | (0.003) |
| Constant | -5.574^{∗∗∗} | (1.200) |
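The positive Age and negative Age^{2} coefficients in the selection equations for forwards (Table 5b) and midfielders (Table 6b) imply an inverted-U relation between age and the selection index, which peaks at −b_Age / (2 · b_Age²). The sketch below works this out from the reported estimates; the helper name `age_turning_point` is an assumption made for illustration. Because the coefficients are reported to only three decimals, the resulting ages should be read as rough indications.

```python
def age_turning_point(b_age: float, b_age_sq: float) -> float:
    """Age at which b_age * age + b_age_sq * age**2 reaches its maximum.

    Requires a negative quadratic coefficient (inverted-U shape).
    """
    if b_age_sq >= 0:
        raise ValueError("no interior maximum: quadratic coefficient must be negative")
    return -b_age / (2.0 * b_age_sq)


# Age and Age^2 estimates copied from Table 5b (forwards) and Table 6b (midfielders).
forwards_peak = age_turning_point(0.284, -0.007)
midfielders_peak = age_turning_point(0.365, -0.008)

print(f"forwards: {forwards_peak:.1f} years, midfielders: {midfielders_peak:.1f} years")
```

Since the normal distribution function is increasing in the index, the age that maximises the index also maximises the implied transfer probability, other regressors held fixed.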

Table 7a: OLS and Heckman two-step estimation results for the regression which only contains defenders. The signs ‘⋄’, ‘∗’, ‘∗∗’, and ‘∗∗∗’ indicate significance of the coefficients at the 10%, 5%, 1%, and 0.1% level, respectively.

Dependent variable: log(Fee)

| Variable | OLS estimate | OLS st. error | Heckman estimate | Heckman st. error |
| --- | --- | --- | --- | --- |
| Age | -0.031 | (0.155) | 0.056 | (0.281) |
| Age^{2} | -0.002 | (0.003) | -0.008 | (0.007) |
| Remaining contract | 0.433^{∗∗∗} | (0.045) | 0.219 | (0.176) |
| Top 5 league player | 1.285^{∗∗∗} | (0.124) | 1.309^{∗∗∗} | (0.127) |
| Before COVID-19 | -0.008 | (0.103) | -0.043 | (0.106) |
| 90s played | 0.032^{∗∗∗} | (0.005) | 0.062^{∗∗} | (0.021) |
| Goals per 90 min | 2.803^{∗∗∗} | (0.740) | 2.838^{∗∗∗} | (0.740) |
| Assists per 90 min | 0.977 | (0.604) | 1.132^{⋄} | (0.608) |
| Interceptions per 90 min | -0.056 | (0.073) | -0.062 | (0.070) |
| Tackles won per 90 min | -0.114 | (0.101) | -0.101 | (0.102) |
| Yellow cards per 90 min | -0.813^{∗} | (0.352) | -0.781^{∗} | (0.339) |
| Constant | 0.987 | (2.076) | -3.709 | (5.159) |
| R^{2} | 0.476 | | | |
| Correlation | | | 1.059 | |
| Observations | 492 | | 4,306 | |
| Residual Std. Error | 1.024 | | 1.022 | |
| p-value F-test | 0.000 | | 0.000 | |

Table 7b: Selection equation results for the regression which only contains defenders. The signs ‘⋄’, ‘∗’, ‘∗∗’, and ‘∗∗∗’ indicate significance of the coefficients at the 10%, 5%, 1%, and 0.1% level, respectively.

Dependent variable: Transfer

| Variable | Estimate | St. error |
| --- | --- | --- |
| Age | 0.013 | (0.078) |
| Age^{2} | -0.002 | (0.002) |
| Remaining contract | -0.073^{∗∗} | (0.023) |
| Matches featured | 0.011^{∗∗∗} | (0.003) |
| Constant | -0.542 | (1.070) |

Table 8a: OLS and Heckman two-step estimation results for the regression which only contains goalkeepers. The signs ‘⋄’, ‘∗’, ‘∗∗’, and ‘∗∗∗’ indicate significance of the coefficients at the 10%, 5%, 1%, and 0.1% level, respectively.

Dependent variable: log(Fee)

| Variable | OLS estimate | OLS st. error | Heckman estimate | Heckman st. error |
| --- | --- | --- | --- | --- |
| Age | -0.615 | (0.444) | 1.561 | (3.987) |
| Age^{2} | 0.012 | (0.008) | -0.034 | (0.082) |
| Remaining contract | 0.303^{∗∗} | (0.113) | -0.275 | (1.089) |
| Top 5 league player | 1.498^{∗∗∗} | (0.334) | 1.598^{∗∗∗} | (0.335) |
| Before COVID-19 | -0.388 | (0.271) | -0.411 | (0.285) |
| 90s played | 0.023^{⋄} | (0.013) | 0.109 | (0.154) |
| Assists per 90 min | 6.661 | (16.831) | 2.048 | (11.537) |
| Save percentage | 0.083^{∗∗∗} | (0.020) | 0.087^{∗∗∗} | (0.009) |
| Yellow cards per 90 min | 1.689 | (2.272) | 1.647 | (1.742) |
| Constant | 2.029 | (6.363) | -38.976 | (73.246) |
| R^{2} | 0.441 | | | |
| Correlation | | | 1.088 | |
| Observations | 80 | | 872 | |
| Residual Std. Error | 1.043 | | 1.022 | |
| p-value F-test | 0.000 | | 0.000 | |