Strategy analysis and rank prediction in 2000m rowing tournaments


MSc Artificial Intelligence

Master Thesis

Strategy analysis and rank prediction in 2000m rowing tournaments

by

Celeste Kettler

10418954

September 11, 2017

36 EC

January 2017 - August 2017

Supervisor: Dr M.W. van Someren

Assessor: Dr B.J.A. Kröse


Abstract

Data from 20 tournaments in the period 2013 to 2016 are used to analyse the strategies applied in 2000m rowing races and their relation to the outcome of the race, where a strategy is defined as one of the possible approaches by which effort can be divided over the race. A strategy as considered in this thesis consists of four continuous, normalized gradients, one for each 500m part of the race. The gradient measures the difference between the stroke rate at the first and at the last measured point of the 500m part, and the average stroke rate of the boat in that race is used as normalization factor. This is the first time such a large amount of data is used in exploratory research on strategies, generating new insights into the relation between strategies and other race-dependent features. The data show that more than one strategy is applied in 2000m rowing races, with the stroke rate over the distance having either a U-shape or an L-shape. One version of the U-shaped strategies occurs most often, rather than the often proposed even strategy in which the velocity is kept as constant as possible. The four gradients do not show a strong mutual relation and are therefore regarded separately, to see the influence of each feature on the strategy shape in each of the four race parts. The features that are known prior to the race, can therefore be used in planning the strategy, and have an influence on the strategy are boat size, round and the number of races after the current race on the same day. Using these features the strategy is predicted, testing the performance of K-Nearest Neighbors, Random Forest and Multi-Layer Perceptron Regression. This shows that the selected features are most influential on the first and last 500m of the race, and suggests that crews take these features into account when deciding on a strategy. Since the main goal of a crew is to win the race, i.e. reach one of the highest three ranks, the relation between strategy and rank is researched through their correlation, a comparison of the strategies used within the different ranks and the value of strategies in the prediction of ranks. The first two are performed on the separate gradients, while the last represents the influence of the combination of the four gradients. When comparing the strategies within the different ranks, the significant differences are mostly found in the middle part of the race. Low correlation values are found between the strategy and rank, which increase when looking per feature configuration, which is a combination of values of the boat size, round and number of races after the current race. Ordinal Logistic Regression, K-Nearest Neighbor Regression, Random Forest Regression and Multi-Layer Perceptron Regression are studied thoroughly to generate a good prediction model. There are only three cases where the strategy has some predictive value; in all other cases the average rank is a closer match to the achieved rank than the rank predicted by the models. Therefore the strategy as defined in this thesis proves not to be influential on the rank a team achieves.


1 Introduction

In rowing, the difference between winning and losing can be a matter of milliseconds. For example, in the A-Final of the Lightweight Men's Double Sculls at the Rio 2016 Summer Olympics, the gold medalists finished in 6:30.70, not far ahead of the silver and bronze medalists, who finished within one second of them. Even fourth place was only 2.59 seconds behind the gold medalists, which is about 1/151 of the time they were racing. This means that every improvement, even the slightest, is important. One such improvement can be made in the pacing strategy, where strategy is defined as any of the possible approaches by which effort can be divided throughout the race. The effort should be divided in such a way that the final time is optimal for a team. The articles of Foster [4] and Fukuba [9] show that a difference can be made by adopting a suitable strategy, mentioning that the use of a wrong strategy can cause a team to start their finishing effort too late. At the same time the energy of the rowers should be divided in such a way that they do not collapse before the end of the race [4]. Another relevant point that Fukuba highlights is the ability of athletes to compensate for the time lost when they start at their fatigue threshold, the threshold after which it is no longer possible for the athlete to stay at the required power output [22]. Foster et al. therefore recommend that athletes try out a variety of strategies while training for big events, to feel what the best strategy is for a team, thereby implying that there is no 'one size fits all' solution. However, they, and many other authors, suggest that it is beneficial to row an even race, because the water resistance increases four times when the velocity is doubled [3][11][16][24]. In practice this does not appear to be the most used strategy, since most rowers seem to adopt a strategy with a powerful start, a decreasing velocity until the 1500 meter point and a sprint finish [10].

The choice of strategy may depend on several factors that are known before the race starts. This thesis investigates whether such dependencies exist and how strong they are. Muehlbauer et al. have researched the influence of boat type, round and rank on the strategy, based on split times at the 2008 Olympic Games [23]. They notice a difference between the split times recorded in heats and finals, indicating that the round may influence the chosen strategy. To support a crew and coach in deciding on a strategy, insights are needed into the factors that influence the strategy and the effect that a strategy has on the outcome of a race. This leads to the main question of this thesis: Is there a relation between the chosen strategy and the rank that is achieved? In other words: are there strategies that must be avoided or applied to win the race?

The questions that will be answered to reach a conclusion on the main question are listed per section below:

Section 2

What is the definition of a strategy?

Section 3

What are the characteristics of a rowing tournament that may be related to the strategy?

Section 4

What relevant features are present in the data provided by the FISA? What data represents the strategy?

Are there features with a possible relation to strategies that are not directly present in the data, but can be composed with it?

Section 6


And is it possible to predict the strategy based on these factors?

Section 7

What is the relation between the achieved rank and the used strategy? Is the rank predictable using solely the strategy?

2 Definition of a strategy

Prior to each race there is a plan sketching the course of the race for one boat which uses the stroke rate (number of strokes per minute) and numbers of strokes to define what needs to happen at which traveled distance in the race [21]. The stroke rates are used because they are adjustable by the rowers and have influence on the speed, and thereby their final race time [12][15]. To distribute the energy of the rowers in such a way that the result is most favorable, rowers apply different stroke rates in different phases of the 2000m race. The adjustments of these stroke rates say a lot about the choice a crew makes about the distribution of their power, which makes them an indication of the strategy used by a crew, and therefore useful when analyzing strategies.

3 Short introduction to rowing tournaments

To understand the different features and their influence on the strategy, a short introduction to the field is necessary. First the structure of a rowing tournament is explained, followed by some aspects that influence the speed of a boat.

3.1 Tournament outline

| Round | Preceding round(s) | Following round(s) |
|---|---|---|
| Heats | None | Repechage, Quarter-Finals, Semi-Finals or Finals |
| Exhibitions | None | A-Final |
| Repechages | Heats | Quarter-Finals, Semi-Finals, Finals |
| Quarter-Finals | Heats and Repechages | Semi-Finals |
| Semi-Finals | Heats and Repechages or Quarter-Finals | Finals |
| Finals | Heats and Repechages, an Exhibition or Semi-Finals | None |

Table 1: Rounds, and their preceding and following rounds in a tournament

A rowing tournament consists of different fields, one field per boat type. The boat type depends on: the number of people in a boat, the number of oars per rower, the presence of a coxswain and the sex and weight of the people rowing the boat. Extra information on the boat types and their presence in the data can be found in appendix B.

The number of entries affects the number of rounds a team needs to row to reach the finals. There are different rounds in a tournament, which are described in table 1 in relation to the rounds preceding them and the rounds following them [6]. To give an example of one of the longer trajectories: one can start in the heats, go through to the repechage, reach the AB semi-finals, end in the top half of the field and compete in the A-Final for first place. The shortest trajectory consists of an Exhibition and an A-Final. Exhibitions, as shown in table 1, are only followed by the A-Final and are held when there are as many or fewer boats than lanes in that particular race. The only reason to row this round is to assign every boat a specific lane in the finals, where the fastest boats earn the inner lanes or the lanes that have an advantage. This advantage is caused by the direction and force of the wind, which leads to currents and vortexes in the water that are not equally present in each lane [25][29]. The exhibition may give the least information, since each boat will get to the A-Final and the lane distribution is only important when a strong wind is present [13][29]. The repechage is a round that is characteristic of rowing and creates a second chance for the boats that did not reach the next round directly. This means that the boats that row in the repechage have one extra race to row compared to the boats that go directly through to the next round. Strategies can vary across rounds [16]: a crew may save energy for other rounds, or go at its maximum to be sure of a position in the highest next round or to skip the repechage. Whether the strategies used in these different rounds differ is examined later in this thesis.

Next to the boats and the rounds, there are some other characteristics of a tournament that may influence the outcome. Every race consists of six boats or fewer, depending on the number of entries. Teams know several days prior to the first round which other teams will be in the same race. After the first round, the results of each round determine the placing of the boats, so the opponents in the next round are known once the results are in. This is usually between a few hours and a day prior to the next race. Different teams enroll for each tournament, which means that some tournaments may be harder to compete in than others, making the tournament type an interesting feature. The number of races per day can vary as well. This depends on the number of rounds a boat needs to get through to reach the final and on whether the rounds are spread over several days. All these aspects need to be represented in features to be able to determine their importance for the race outcome.

4 Data

Data on rowing competitions are used to answer the questions on the factors that influence the strategy and on the relation between strategy and rank. These data have been made publicly available by the FISA [25] in the form of PDF files. The files containing results and GPS data, of which the latter include speeds and stroke rates per 50m, from the tournaments World Cup 1, World Cup 2, World Cup 3, European Championships and World Championships in the four-year period from 2013 to 2016 serve as the data source. All relevant information needs to be extracted from these files, as described in section 4.1, and is transformed into features that either describe the strategy, possibly influence the strategy or contain information on the achieved rank.

4.1 Data extraction

Since the data obtained from the FISA site are contained in PDF files, a conversion method was created to extract them. Appendix A gives more details on the implementation for readers who wish to replicate this approach. Because there are two types of files (GPS and Results), two extraction methods are implemented, and all information is combined in one dataframe: a table in which each row is an instance and each column is a feature, suitable for feature construction and data analysis.
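The combination of the two file types can be pictured as an inner join. The sketch below is a minimal illustration in plain Python and not the implementation from appendix A; the key fields (`race_id`, `team`) and the record layout are hypothetical.

```python
def merge_records(gps_rows, result_rows, key=("race_id", "team")):
    """Inner join: keep only boats present in both the GPS and the Results data."""
    # Index the results by the (hypothetical) key fields.
    results_by_key = {tuple(r[k] for k in key): r for r in result_rows}
    merged = []
    for gps in gps_rows:
        k = tuple(gps[f] for f in key)
        if k in results_by_key:
            # One row per individual boat race, one column per feature.
            merged.append({**results_by_key[k], **gps})
    return merged


# Toy records, invented for illustration only.
gps = [
    {"race_id": "R1", "team": "NED", "stroke_rates": [44.0, 38.5]},
    {"race_id": "R1", "team": "GER", "stroke_rates": [45.5, 39.0]},
]
results = [
    {"race_id": "R1", "team": "NED", "rank": 1},
    {"race_id": "R1", "team": "POL", "rank": 2},
]
table = merge_records(gps, results)
print(table)
```

In this toy input only the NED boat appears in both sources, so it is the only row that survives the merge.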

4.2 Content

When both files are combined in one dataframe, it contains 5461 individual boat races. This number results from the merge, in which only the boat races with complete information are kept and the ones missing either the Results or the GPS data are discarded. This is done because all features must be present in all instances; no research can be done on absent information. The information available for each individual boat race is displayed in table 2. Appendix B gives a short clarification of the different features in this content. The features from table 2 are already present in the data and no extra processing steps are needed to obtain them. Tables 30, 32, 31 and 28 in appendix B contain the number of occurrences of the different boat types, rounds, tournaments and teams. The dataframe includes 18 boat types, with an average of 303 occurrences per boat type, where the M8+ has the fewest occurrences, 13, and the LM2x has the most, 677. There are 5 different rounds, of which the heats occur most often, with a total of 1677 times, and the Quarter-Finals least often, with 287. The number of occurrences per team varies, as can be seen in Fig. 30 in the appendix: some teams appear 36 times, like the Light Weight Women scull of Poland, while other teams appear only once, like the Women four scull from the USA. In section 4.3 the features containing the speed and stroke rate per 50m are described and cleaned; they are used for determining the strategy. The date itself is not used as a feature, but features are generated from the date in section 4.5. The start lane, the presence of a coxswain, the tournament, the boat type, the round, the year, the country, the sex and the team name are known prior to the race, which makes them relevant for answering the question: Which factors are important in the choice for a strategy? The ranks are used for the analysis of the relation between rank and strategy in section 7.

| Content | Interval or amount |
|---|---|
| Team name (Country + team number) | 1 per boat |
| Names of team members | depends on boat type |
| Date (DD-MM-YY) | 1 per file |
| Boat type | 1 per file |
| Tournament | 1 per file |
| Round | 1 per file |
| Presence of a coxswain | 1 per file |
| Start lane | 1 per boat |
| Speed per boat in m/s | every 50m |
| Stroke rate per boat in spm | every 50m |
| Time elapsed from start | every 500m |
| Time elapsed since last benchmark | every 500m |
| Rank | every 500m |
| Rank of time elapsed since last benchmark | every 500m |

Table 2: content of the files provided by the FISA

4.3 Smoothing the stroke rate and speeds data

To check the viability of the speeds, in m/s, and the stroke rates, in strokes per minute (spm), from the GPS files, they are visualized in graphs. These show two kinds of irregularities: resolution irregularities, where the graph fluctuates between two values because the actual value lies in between but cannot be represented, and outlier irregularities, where one value differs strongly from the rest of the values of that race. To determine the origin of the irregularities, video footage of 180 boats is checked. The footage confirms that most of them are faults, and that no irregularity was intended as a strategy. Graphs illustrating these irregularities can be seen in appendix C. These inconsistencies are smoothed using two smoothing methods that average over different window sizes. More on these approaches can be found in appendix D.

4.4 Representation of strategy in the data

As previously defined in section 2, the adjustments in stroke rate represent the choice of power distribution by the crew. These adjustments can be found in the GPS stroke rate measurements, making them the best representation of the rowers' strategy present in the data. They are captured by the difference in stroke rate between two distances, here called the gradient, for it marks the magnitude of the slope of the stroke rate graph for a certain section of the 2000m rowing race. For example, if a crew has a stroke rate of 45 at the 50m point and one of 35 at the 500m point, the gradient is 35 − 45 = −10 for these 450m, where the negative sign marks a decline in stroke rate, in line with the negative gradients reported for the first part in Fig. 4. Because these measurements are taken from the race itself, they capture other effects apart from the race plan as well. Among these effects are tiredness, giving up and changes in stroke rate used to overtake another crew. In the analysis these effects are not overlooked as a possible cause of the height of the stroke rate, but precisely determining the size of their influence on the race is outside the scope of this study.

Figure 1: stroke rate of 25 randomly selected boats over the course of the race (a) and the gradient of the selected intervals (b)

Race plans are often divided into four parts: 50-500m, 500-1000m, 1000-1500m and 1500-2000m [21], since the 500m points are marked, making them the only reference a rower has for the traveled distance. The stroke rate is measured for the first time at the 50m point, where a peak in stroke rate is detected for each crew. This peak is related to the acceleration of a team at the start of the race, and its height is representative of the effort that a crew puts into getting up to speed. The division per 500m has the advantage that small fluctuations that are not part of a strategy are disregarded, making the gradient determination more reliable than with a smaller window. A larger window would generalize the trend too much, especially because the changes in the value of the gradient are most obvious at the 500m points. In addition, an analysis using 500m parts is useful to a coach who is making a race plan: because a rower has no other reference points for the distance travelled, a race plan based on smaller distances is not useful. Since the data contain GPS measurements of the stroke rate at each 50m point, the strategy can be determined from these values. When these values are visualized in graphs, one can see that the 500m points often mark a change in the gradient of the stroke rate, as visible in Fig. 1. Therefore the strategy is a combination of the values of four gradients, one for each 500m part.
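As an illustration of the four 500m parts, the sketch below computes one raw gradient per part from the stroke rates measured every 50m. It is a minimal sketch and not the thesis code: the function name is hypothetical, and the sign convention (last point minus first point, so that a decline is negative) is an assumption chosen to match the negative mean reported for the first part in Fig. 4.

```python
# The four race parts named in the text, as (start, end) distances in metres.
PART_BOUNDS = [(50, 500), (500, 1000), (1000, 1500), (1500, 2000)]


def part_gradients(stroke_rates, step=50):
    """Return one raw gradient per 500m part.

    stroke_rates[i] is assumed to be the stroke rate (spm) measured at
    distance (i + 1) * step metres, so 40 values cover 50m to 2000m.
    """
    def rate_at(distance):
        return stroke_rates[distance // step - 1]

    # Assumed convention: rate at the end of the part minus rate at its start.
    return [rate_at(end) - rate_at(start) for start, end in PART_BOUNDS]


# Toy profile, invented for illustration: only the boundary points matter here.
rates = [36.0] * 40
rates[0] = 45.0    # 50m, start peak
rates[9] = 35.0    # 500m
rates[19] = 34.0   # 1000m
rates[29] = 34.5   # 1500m
rates[39] = 38.0   # 2000m
gradients = part_gradients(rates)
print(gradients)
```

Only the first and last measurement of each part enter the gradient, which is why fluctuations inside a part do not affect it.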

Figure 2: Distribution of the frequency of occurrence of the effect size of one stroke

Figure 3: The 10 most used strategies (a) and the most used strategy (b), in normalized stroke rate over distance

The effect of the stroke rate on the speed is not always the same, as the distribution in Fig. 2 shows. This figure shows the number of occurrences of each speed/stroke rate proportion (meters per stroke), where the speed is expressed in m/s and the stroke rate in strokes per second. There are many causes of this difference, for example oar length, water current, water temperature, wind and a crew's power output. Further research on the contribution of these factors is beyond the scope of this thesis. Because the goal is to see which strategies lead to winning the race, it must be possible to reason generally, across different races. Therefore the gradient is normalized by dividing it by the average stroke rate of that boat in that race. This puts crews that perform the same strategy at different stroke rates on the same level, and makes it possible to compare strategies independently of the height of the stroke rate. The effect of a crew's power output remains present and influences the rank a crew achieves, but this research uses a large number of different crews performing the same strategies. If a strategy leads to a certain rank in the majority of the cases, every crew rowing this strategy has a larger probability of reaching that rank. Whether the ability to row certain strategies depends on the power output of a team is left for further research. When visualizing the strategy that corresponds to a certain combination of the four gradients, I use a plot of the normalized stroke rate, with the gradients representing the slopes of the four 500m parts. This gives a clearer image of what happens over the 2000m than a plot of the gradients like Fig. 1 (b). The continuous representation of the strategies results in 5424 unique strategies for 5461 individual races when the full precision of the normalized values is maintained.

Figure 4: Distribution of the four gradients: (a) 50-500m (µ = −0.24, σ = 0.09), (b) 500-1000m (µ = −0.04, σ = 0.03), (c) 1000-1500m (µ = <0.001, σ = 0.04), (d) 1500-2000m (µ = 0.04, σ = 0.07)

To find out whether different strategies are used and which types of strategies are used most often, the strategies are grouped into gradients with a precision of one decimal, meaning that gradient -0.23 is part of the group -0.2. Figure 3 shows the 10 discrete strategies that account for 62% of the data. The two shapes that are represented in the 10 most occurring discrete strategies are the U-shape and the L-shape. The most used strategy is a U-shaped strategy in which rowers steeply decrease their stroke rate until the 500m point, decrease it less steeply in the second 500m, increase it slightly in the third 500m and increase it heavily in the final 500m. This strategy appears 592 times in the data, which is 11% of all instances. This is almost in line with the findings of Garland, though he found a decline in the third 500m as well. Since a lot of information gets lost when discretizing, the gradients are kept at full precision for the rest of the thesis.
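The normalization by the average stroke rate and the grouping to one decimal can be sketched as follows. This is a minimal illustration in plain Python, not the thesis code; the function names and the toy strategies are invented.

```python
from collections import Counter


def normalized_strategy(gradients, stroke_rates):
    """Divide each raw gradient by the boat's average stroke rate in that race."""
    mean_rate = sum(stroke_rates) / len(stroke_rates)
    return tuple(g / mean_rate for g in gradients)


def discretize(strategy):
    """Group a strategy by rounding each gradient to one decimal.

    Adding 0.0 turns a rounded -0.0 into 0.0, which only affects how the
    group is printed.
    """
    return tuple(round(g, 1) + 0.0 for g in strategy)


# Toy normalized strategies, invented for illustration.
strategies = [
    (-0.23, -0.04, 0.01, 0.04),
    (-0.19, -0.02, 0.03, 0.02),
    (-0.31, -0.04, 0.00, 0.21),
]
counts = Counter(discretize(s) for s in strategies)
most_common_group, occurrences = counts.most_common(1)[0]
print(most_common_group, occurrences)
```

The first two toy strategies differ at full precision but fall into the same one-decimal group, which is the information loss that motivates keeping full precision in the rest of the analysis.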

Looking at Fig. 3 and the histograms in Fig. 4, the first and the last gradient have a higher variance than the two gradients representing the middle part. This translates to variation in the use of a higher or lower effort in either the start or the finish, which indicates that more than one strategy is used in these tournaments. The most used strategy is, contrary to the proposed even one, a U-shaped strategy. This is in line with the findings of Muehlbauer and Garland; however, they found a slight decrease in both stroke rate and time in the third part, while this data shows a small increase in stroke rate.

4.5 Feature construction

Apart from the strategy some features need extra processing, for they are not directly represented in the data but are characteristic of a 2000m race. The full list of created features can be viewed in appendix F. Not all features are used for this research, but their presence in the dataframe may be of importance for further research. The constructed features that are used in this thesis are described in the next sections.


4.5.1 Date based features

Two features are created from the date: whether the same team has another race after the current race on the same day, and whether they had one before this race. The first may indicate a change in strategy to save energy for the next race, and the second could indicate tiredness, also resulting in a different strategy. These features are created per team, and no check has been performed on whether a particular person from the team had races before or will have races afterwards. As it is highly irregular for a person to compete in different boats, this is not considered as a possible influence on the strategy [13].
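A minimal sketch of how such date-based features could be derived is given below. The field names (`team`, `date`, `start_time`) are hypothetical; in particular, some ordering of races within a day, here a start time, is assumed to be available, which the source does not state.

```python
from collections import defaultdict


def add_same_day_features(races):
    """Add, per race, the number of races of the same team earlier and later that day."""
    by_team_day = defaultdict(list)
    for race in races:
        by_team_day[(race["team"], race["date"])].append(race)
    for group in by_team_day.values():
        # Order the races of one team on one day by (assumed) start time.
        group.sort(key=lambda r: r["start_time"])
        for i, race in enumerate(group):
            race["races_before_same_day"] = i
            race["races_after_same_day"] = len(group) - 1 - i
    return races


# Toy schedule, invented for illustration.
races = [
    {"team": "LM2x NED", "date": "27-05-16", "start_time": "09:30"},
    {"team": "LM2x NED", "date": "27-05-16", "start_time": "15:10"},
    {"team": "LM2x NED", "date": "28-05-16", "start_time": "10:00"},
]
add_same_day_features(races)
for race in races:
    print(race)
```

The two yes/no features described above follow by testing whether each count is larger than zero.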

4.5.2 Team based features

To determine characteristics of teams, some features are created using all races of that team in the data. A team in this case is a certain combination of boat type and country name and, if available, the crew number, which is a numeric suffix concatenated to the country name. This suffix is only available if several crews from one country participate in the same boat type. This definition of a team means that different configurations of crew members and changes of crew members are not registered as a new team, for this would leave too few occurrences of each team. The average rank of a team is added to the features, indicating how well that team performs, which may influence the strategy they choose. For example, if a team always comes in first, they may have more self-esteem and dare to row certain strategies.

4.5.3 Opponent based features

Since a team knows beforehand who they are going to race against, features can be generated that contain characteristics of the opposing teams. Considering that a race has at most 6 boats racing against each other, the number of opposing boats is at most 5, meaning that up to 5 team averages are added as opponent features, which are used to calculate the average rank of all opponents. This feature can be of influence when a crew rows the heats and knows that all opponents usually perform worse than them. This can lead to a strategy in which their effort is kept at the minimum necessary to make it to the next round, for example by not sprinting at the end of the race.
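The team and opponent features can be sketched as two small aggregations. The code below is an illustrative sketch in plain Python, not the thesis implementation; the function names, field names and toy records are hypothetical.

```python
from collections import defaultdict


def team_average_ranks(races):
    """Average final rank per team over all of its races in the data."""
    ranks = defaultdict(list)
    for race in races:
        ranks[race["team"]].append(race["rank"])
    return {team: sum(values) / len(values) for team, values in ranks.items()}


def average_opponent_rank(team, teams_in_race, team_avg):
    """Mean of the team averages of all other boats in the same race (at most 5)."""
    opponents = [t for t in teams_in_race if t != team]
    return sum(team_avg[t] for t in opponents) / len(opponents)


# Toy results, invented for illustration; a team is boat type plus country.
races = [
    {"team": "LM2x NED", "rank": 1},
    {"team": "LM2x NED", "rank": 3},
    {"team": "LM2x GER", "rank": 2},
    {"team": "LM2x POL", "rank": 4},
    {"team": "LM2x POL", "rank": 6},
]
team_avg = team_average_ranks(races)
opp = average_opponent_rank("LM2x NED", ["LM2x NED", "LM2x GER", "LM2x POL"], team_avg)
print(team_avg, opp)
```

A low value of the opponent average signals a strong field, a high value a weak one, which is the information a crew could use when deciding how much effort a heat requires.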

5 Mutual correlation strategy gradients

5.1 Method

To see whether each of the gradients can be analyzed separately, without considering the other gradients, the mutual correlation between the gradients is computed. It is expected that the gradients are highly correlated, for someone who starts really fast may not have the energy to sprint at the end and vice versa. Since the values of the gradients are continuous, the correlation is calculated with the Pearson correlation, found in equation 1, where x and y are the two compared features, in this case two gradients representing two different parts of the race, x̄ is the mean of x and ȳ is the mean of y. A ρ of 0 means that there is no linear correlation; a ρ of 1 or -1 means that there is a perfect linear relation, either positive or negative.
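As a minimal sketch, the Pearson coefficient of equation 1 can be computed in plain Python as follows. The function name and the toy values are illustrative and not taken from the thesis code.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long sequences of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of products of the deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator: product of the root sums of squared deviations.
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


# Toy gradients, invented for illustration.
first_part = [-0.30, -0.25, -0.20, -0.15]
last_part = [0.02, 0.04, 0.06, 0.08]
rho = pearson(first_part, last_part)
print(rho)
```

The toy values are exactly linearly related, so the coefficient is 1 up to rounding; the gradients in the data are far from this extreme.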

$$\rho = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{1}$$

5.2 Results

The results can be found in tables 3 and 4, showing low correlations between the gradients. All but two are significant; the non-significant correlations are the one between the gradients at 50-500m and 1000-1500m and the one between 50-500m and 1500-2000m. The significant correlations have low coefficients, of which the highest is a positive value of 0.113 between the third and the last 500m, meaning that the values of both gradients sometimes change in the same direction. This means that all kinds of gradient combinations appear, limited to the gradient values that have been seen in this study. These values can be seen in Fig. 4 in section 4.4. The largest proportion of the data has a negative gradient at the beginning, a close to zero gradient in the middle two parts and most often a final gradient around 0. So all kinds of deviations from this strategy are possible, and the gradients can be analyzed independently of each other.

| Correlation | 50-500m | 500-1000m | 1000-1500m | 1500-2000m |
|---|---|---|---|---|
| 50-500m | - | -0.072 | -0.007 | -0.014 |
| 500-1000m | -0.072 | - | 0.036 | 0.047 |
| 1000-1500m | -0.007 | 0.036 | - | 0.161 |
| 1500-2000m | -0.014 | 0.047 | 0.161 | - |

Table 3: Correlation between the four strategy gradients

| Correlation p-value | 50-500m | 500-1000m | 1000-1500m | 1500-2000m |
|---|---|---|---|---|
| 50-500m | - | <0.001 | 0.605 | 0.301 |
| 500-1000m | <0.001 | - | 0.008 | 0.001 |
| 1000-1500m | 0.605 | 0.008 | - | <0.001 |
| 1500-2000m | 0.301 | 0.001 | <0.001 | - |

Table 4: p-values of the correlations between the four strategy gradients

6 Relation features to strategy

6.1 Method

To see whether certain features (round, boat size, etc.) influence the strategy that is used, the correlation is calculated between the strategy and each of these features. The results are used to decide whether a more thorough study of the relation between the strategy and the feature is necessary. Since the strategy consists of four gradients, where each gradient represents a different 500m part of the race, the analysis can be split into four parts. This is done because the correlation between the parts of the strategy is low, as can be seen in table 3, meaning that many different combinations of gradients exist. Correlations are therefore calculated per 500m part, to see which features influence which parts of the race. If a correlation is significant, the differences in the four gradients between the various groups within that feature are measured using statistical t-tests. An example is a comparison of the 50-500m gradients rowed in the heats and the ones rowed in the finals, which are both values of the feature Round. The features that are found to be influential are used as features in a model that predicts the strategy. If a correlation is found but the t-tests do not show significant differences for the gradients, the feature is not used in prediction. This would indicate a confounding variable, since the distribution of the gradients within the different values of the feature itself does not explain the correlation. To generate the best possible fit for prediction, several regression methods are compared.

6.2 Correlations

First a Spearman correlation test is performed between each feature and each part of the strategy, to determine whether the nature of the relation should be examined more thoroughly with t-tests. The features boat size, presence of coxswain, average rank opponents, weight class, # races after on same day, sex and year are ordinal or binary, making the Spearman correlation the most suitable coefficient for their comparison to the strategy gradients. Equation 2 shows how this correlation is calculated, where n is the number of instances and d_i is the difference between the rank of the feature value and the rank of the gradient value of instance i.

$$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n^3 - n} \tag{2}$$
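Equation 2 can be sketched in a few lines of plain Python. The sketch below assumes that no two values within a variable are equal; the function names and toy values are illustrative only.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest), assuming all values differ."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_no_ties(x, y):
    """Spearman correlation via equation 2, valid when there are no ties."""
    n = len(x)
    # d_i is the difference between the two ranks of instance i.
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * sum_d2 / (n ** 3 - n)


# Toy values, invented for illustration.
feature = [10, 20, 30, 40, 50]
gradient = [0.1, 0.3, 0.2, 0.5, 0.4]
rho = spearman_no_ties(feature, gradient)
print(rho)
```

Ordinal and binary features contain many equal values, and equation 2 is then only an approximation. In that case a common alternative is to assign average ranks to tied values and compute the Pearson correlation of the ranks, which is how `scipy.stats.spearmanr` treats ties.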

For the three categorical variables, round, tournament and team, it is not possible to determine the relation using correlation as a measure. Their relation to the strategy gradients needs to be determined with t-tests, which is done in section 6.3.

To determine the relevance of each non-categorical feature, a correlation test is done separately for each part of the strategy. If a feature has a correlation of more than 0.1, positive or negative, the feature is used in the predictions and the nature of the correlation is researched. Correlations below 0.1 are considered too weak to be influential. A value of 0.1 is still very weak, but since there are only 6 correlation coefficients above this threshold, too little information would remain if the threshold were increased. The tables in Fig. 5 show the 3 highest correlation coefficients for each part. The features boat size and # races after current race on the same day have a correlation, of which boat size is the only one having a correlation with the gradients of each race part.

| Race part | Feature | Correlation |
|---|---|---|
| (a) 50-500m | Boat size | 0.334 |
| (a) 50-500m | Presence of coxswain | 0.090 |
| (a) 50-500m | Weight class | 0.080 |
| (b) 500-1000m | # races after on day | -0.112 |
| (b) 500-1000m | Boat size | 0.093 |
| (b) 500-1000m | Sex | 0.084 |
| (c) 1000-1500m | Boat size | 0.117 |
| (c) 1000-1500m | Year | -0.078 |
| (d) 1500-2000m | Boat size | 0.106 |
| (d) 1500-2000m | Average rank opponents | 0.058 |

Figure 5: Correlation between the non-categorical features and the strategy gradients. The top 3 is shown per gradient.

6.3 T-test comparisons

To determine the nature of the relationship between the features and the strategy, t-tests are performed on the categorical features and on the features with a correlation to the strategy. When comparing two different groups, for example men and women, all 50-500m points of the men and all 50-500m points of the women are grouped. These two groups are compared to each other to determine whether the strategy differs much between the two groups and which part of the race shows the largest differences. All groups of each category are compared pairwise, which makes it possible to see the influence of each group compared to the others. More general approaches, like one-versus-all or an average over all pairwise group differences, cannot show in which part of the strategy or between which two groups the differences are most distinctive. Strategy differences might be lost when the average is pulled towards 0 by all parts that are non-discriminatory. To make a fair comparison, the groups are sampled without replacement, using the size of the smallest group as sample size. Both the histograms and the t-test results are based on these samples, so every compared sample has the size of the smallest group. The next sections show the results of the two-sample unequal variance t-tests [27] between each two groups of one feature for every part of the race separately. A significance level of 0.05 is used. The following sections contain the results that are most relevant for this study. A more complete view of all generated results is given in appendix G.
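The comparison procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the code used for the thesis: the function names are hypothetical, and only the t statistic and the Welch-Satterthwaite degrees of freedom are computed, so the p-value would still have to be looked up in a t distribution.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Two-sample unequal-variance t statistic and its degrees of freedom."""
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


def balanced_welch_t(group_a, group_b, rng):
    """Sample both groups without replacement down to the smallest group size."""
    size = min(len(group_a), len(group_b))
    sample_a = rng.sample(group_a, size)
    sample_b = rng.sample(group_b, size)
    return welch_t(sample_a, sample_b)


# Hypothetical gradients of one race part for two groups of unequal size.
group_one = [-0.31, -0.28, -0.35, -0.22, -0.30, -0.27, -0.33, -0.29]
group_two = [-0.12, -0.18, -0.15, -0.10, -0.20]
t_score, dof = balanced_welch_t(group_one, group_two, random.Random(0))
print(t_score, dof)
```

Repeating this for every pair of groups and every race part gives the kind of pairwise comparison reported in the following sections.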


6.3.1 Rounds

Since round is a categorical variable, it is researched using pairwise t-tests per gradient on each pair of round types. As described in section 3, there are several rounds that rowers need to get through to reach the final. Because rowers want to win the tournament, they want to have as much power left as possible for the final, while making sure they reach it by performing well enough in the previous rounds. Muehlbauer et al. find a difference in finishing times between the heats and the finals [23], indicating a difference in approach between these rounds. To determine whether there is a difference in the strategies used in each round, and thereby whether the round has predictive power, the four strategy gradients are compared for every possible combination of 2 round types. The letters of the rounds are used in the tables and the figures: H represents the heats, R the repechages, S the semi-finals and F the finals.

| Compared rounds | Part | t-score | p-value |
| --- | --- | --- | --- |
| F and H | 1500-2000m | 11.049 | <0.001 |
| H and S | 1500-2000m | -5.642 | <0.001 |
| H and R | 1500-2000m | -5.511 | <0.001 |
| F and R | 1500-2000m | 5.402 | <0.001 |
| F and S | 1500-2000m | 5.305 | <0.001 |
| ... | ... | ... | ... |
| F and R | 1000-1500m | -0.411 | 0.681 |
| H and S | 50-500m | 0.221 | 0.825 |
| R and S | 50-500m | 0.131 | 0.896 |
| R and S | 1500-2000m | -0.115 | 0.909 |
| H and R | 50-500m | 0.098 | 0.922 |

Table 5: Round comparison most and least significant results

(a) H vs R 50-500m (b) F vs H 1500-2000m


(a) Heats (b) Finals

Figure 7: The average strategies and the 95% confidence interval of the two most significantly different rounds

The results for the most and least similar rounds, as determined by the two-sample t-test on two samples of 286 individual boat races, are in table 5, showing the highest number of significant values in comparisons to the heats. Since the teams are distributed over the heats according to world rank, there is a large discrepancy in level between the teams. This creates a higher possibility that a team is far ahead of the other teams and is able to maintain its position at a lower stroke rate, explaining the shape of the average strategy in Fig. 7. The most similar rounds are the repechages and the semi-finals, showing only one compelling difference, in the 500-1000m part of the race. The final 500m differs significantly for every round pair except the repechages and the semi-finals, going from declining in the heats to inclining in the finals, contrary to the first 500m, where no significance is obtained. The rounds show enough variation to make round a valuable feature in the prediction of strategies.

6.3.2 Tournament

| Compared tournament | Part | t-score | p-value |
| --- | --- | --- | --- |
| ECH and WCH | 1000-1500m | -3.356 | 0.001 |
| WC2 and WCH | 1000-1500m | -2.444 | 0.015 |
| WC3 and WCH | 1500-2000m | -2.226 | 0.027 |
| WC2 and WCH | 50-500m | -1.988 | 0.048 |
| ... | ... | ... | ... |
| WC1 and WC3 | 50-500m | 0.288 | 0.773 |
| ECH and WCH | 500-1000m | -0.086 | 0.932 |
| WC1 and WC3 | 1000-1500m | 0.055 | 0.956 |
| WC1 and WC2 | 1500-2000m | -0.043 | 0.966 |
| ECH and WC3 | 50-500m | -0.041 | 0.967 |

Table 6: Tournament comparison results

Tournament is the second categorical feature that needs to be examined with t-tests. ECH stands for European Championships, WC for World Cup and WCH for World Championships. The results from the t-test show only four significant differences, all in comparisons to the World Championships. The two most different gradient distributions are found on the 1000-1500m part, between the European Championships and the World Championships and between World Cup 2 and the World Championships, and show slightly higher gradients in the World Championships than in both other tournaments. But alongside these two values, there are eight comparisons in which no difference can be determined for the same race part. Because of the high number of comparisons where no difference can be shown, the tournament will not be used in the prediction of the strategy: it does not appear to influence the strategy as defined here.


(a) ECH vs WCH 1000-1500m (b) WC2 vs WCH 1000-1500m

Figure 8: The two significant tournament gradient comparisons on the 1000-1500m part of the race

6.3.3 Team

The third and final categorical feature is team, and the t-test is used to see whether teams hold on to the same strategy and whether different strategies are used among teams. Many teams performed in only one or two tournaments and therefore have fewer than 15 occurrences in the data, as one can see in Fig. 9, where group size refers to the group of all occurrences of one team; if a team rowed 3 races the group size is 3. To get the most information out of the t-test, only teams with a group size of at least 15 take part in the paired comparison. All teams from one boat category are compared to each other, to eliminate the influence of the boat size. When there are fewer than 2 teams with at least 15 occurrences, the boat type is not considered.

Figure 9: Histogram on the number of occurrences of all teams

(a) IRL vs NOR LM2X (b) ITA vs FIN W2X


| Compared countries | Boat type | Part | t-score | p-value |
| --- | --- | --- | --- | --- |
| GBR and BUL | LM1x | 50-500m | -3.176 | 0.003 |
| DEN and POR | LM1x | 500-1000m | -0.003 | 0.998 |
| GBR and GER | LM2- | 50-500m | -5.714 | <0.001 |
| GBR and JPN | LM2- | 1500-2000m | -0.034 | 0.973 |
| IRL and NOR | LM2x | 500-1000m | 12.016 | <0.001 |
| HUN and DEN | LM2x | 1000-1500m | 0.001 | 0.999 |
| DEN and AUT | LM4- | 50-500m | -9.232 | <0.001 |
| CAN and CZE | LM4- | 1500-2000m | -0.016 | 0.987 |
| DEN and NZL | LW1x | 50-500m | -6.252 | <0.001 |
| IRL and AUT | LW1x | 1500-2000m | -0.023 | 0.982 |
| CAN and RUS | LW2x | 50-500m | -6.212 | <0.001 |
| JPN and ARG | LW2x | 1500-2000m | 0.009 | 0.993 |
| BLR and DEN | W1x | 50-500m | 6.873 | <0.001 |
| BLR and DEN | W1x | 1000-1500m | 1.57 | 0.13 |
| ITA and FIN | W2x | 50-500m | -11.119 | <0.001 |
| GER and DEN | W2x | 1000-1500m | -0.009 | 0.993 |
| GER and NZL | W4x | 1000-1500m | -5.346 | <0.001 |
| GER and GBR | W4x | 1000-1500m | 0.001 | 0.999 |

Table 7: The highest and lowest significance of comparison of countries per boat type. The combination of a country and a boat type is a team.

The most and least different gradients of teams per boat type are listed in table 7. Each category shows both significant and non-significant values, and team is therefore considered suitable for prediction. That some teams consistently have a different approach is also visible in Fig. 10.

6.3.4 Boat size

| Compared boat sizes | Part | t-score | p-value |
| --- | --- | --- | --- |
| 1 and 2 | 50-500m | -2.064 | 0.042 |
| 1 and 4 | 50-500m | -4.88 | <0.001 |
| 1 and 8 | 50-500m | -7.943 | <0.001 |
| 1 and 8 | 1500-2000m | -4.106 | <0.001 |
| 2 and 4 | 50-500m | -3.308 | 0.001 |
| 2 and 8 | 50-500m | -7.007 | <0.001 |
| 2 and 8 | 1500-2000m | -2.003 | 0.048 |
| 4 and 8 | 50-500m | -3.529 | 0.001 |
| 4 and 8 | 1500-2000m | -2.087 | 0.04 |

Table 8: Significant boat size comparison results

| Compared boat sizes | Part | t-score |
| --- | --- | --- |
| 1 and 2 | 500-1000m | 0.298 |
| 1 and 4 | 500-1000m | 0.222 |
| 1 and 8 | 1000-1500m | -0.362 |
| 2 and 4 | 500-1000m | -0.088 |
| 2 and 4 | 1500-2000m | 0.18 |

Table 9: Most similar boat size gradient distributions

The first feature that is researched because of its correlation to the strategy is boat size. There are 4 boat sizes in rowing: 1, 2, 4 and 8. When only one person is in a boat, more decisions can be made while racing. The rowers can feel the amount of energy they have left and may change their stroke rates accordingly. When rowing in an eight, rowers must follow the commands of the coxswain; even though one person may have some energy left, he/she cannot decide to go faster, as seven other people may have a different feeling. Next to the ability to make quick decisions, an eight-manned boat has a much larger mass than a smaller boat, and this means that increasing and decreasing speed takes longer. It also has a larger power output, which could mean that a lower stroke rate is sufficient for reaching a high speed. Therefore the expectation is to see a less fluctuating strategy in larger boats, and a higher standard deviation and thereby more variation in the smaller boats. The two-sample t-tests are performed on a sample of 560 for each comparison.


(a) 2 vs 4 in 500-1000m (b) 1 vs 8 in 50-500m

Figure 11: The histograms of the least significant (Left) and the most significant (Right) boat size comparisons

(a) 1 (b) 8

Figure 12: The average strategies and the 95% confidence interval of the two most significantly different boat sizes

Table 8 contains all significant results, which are 9 of the 24 comparisons. Every comparison of two boat sizes has a significant difference in the first 500m of the race. Table 9 shows the comparisons in which the least difference is detected. The second 500m of the boat sizes 2 and 4 are the most similar, which is highlighted by the even spread of values in the histogram in Fig. 11. The comparison between boat size 1 and boat size 8 is the most different in the first 500m, which is underlined by both the average strategy as seen in Fig. 12 and the distributions shown in Fig. 11. The average strategy of the one-manned boats starts at a stroke rate that is 1.3 times their average stroke rate and decreases to their average stroke rate in the first 500m, while on average the eight-manned boats start at a stroke rate that is only 1.15 times their average stroke rate. The most significant comparison when considering all parts of the race is the one between boat sizes 1 and 8. The distributions of boat size 1 contain more negative gradients in the first and last 500m than those of size 8, visible in appendix G in Fig. 40. This means that a boat with one rower often decreases its stroke rate more drastically after the start sprint. The differences found show that the use of boat size as a feature would add value to the prediction of the strategy, and that creating a separate model per boat type when predicting the rank from the strategy can be beneficial. For more details see appendix G.

6.3.5 Number of races after current race on same day

| Compared # of races after | Part | t-score | p-value |
| --- | --- | --- | --- |
| 0 and 1 | 50-500m | 2.188 | 0.029 |
| 0 and 1 | 500-1000m | 5.961 | <0.001 |
| 0 and 1 | 1000-1500m | 8.968 | <0.001 |
| 0 and 1 | 1500-2000m | 8.609 | <0.001 |

Table 10: # races after current race comparison results

(a) 0 vs 1 in 50-500m (b) 0 vs 1 in 1000-1500m

Figure 13: The histograms of the least significant (Left) and the most significant (Right) number of races after current race comparisons

Another feature with a correlation to the strategy is the number of races after the current race on the same day. If a crew knows that they have to row another race, in a round closer to the final, they might save some energy. The results in table 10 show significant differences in the gradients in all parts of the race, with the most different distributions in the 1000-1500m part. This part is often more positive when no races follow on the same day and thereby satisfies the expectation. Fig. 14 illustrates this difference; the average strategy when another race follows declines much more, resembling the average strategy of the heats more than that of the finals. This is probably because finals are not in this group, for there is no round after the final. Nevertheless there is a clear relationship between the number of races to come on the same day and the strategy, and this feature will be used in strategy prediction.


(a) 0 (b) 1

Figure 14: The average strategies and the 95% confidence interval of the existence of a race after current race comparisons

6.4 Strategy prediction

To see whether it is possible to predict the strategy used from the features mentioned above, which are known in advance of the race, several supervised regression techniques are implemented and applied to the data. As all features are categorical, it is necessary to use a method that does not regard the classes as continuous values. K-Nearest Neighbor Regression, Random Forest Regression and Multi-Layer Perceptron Regression are all fit for mapping binary class indicators to a continuous output. K-Nearest Neighbor Regression is based on the distance to one or more previously seen examples, Random Forest Regression is based on decision trees and Multi-Layer Perceptron Regression is a neural network approach. For these techniques to function properly, the discrete class numbers of each feature are transformed into as many one-hot features as there are classes in the feature. For example, the boat size feature contains four boat sizes (1, 2, 4 and 8). For each boat size a new feature is generated containing a 1 in the rows where that boat size occurs and a 0 in the others. This results in 763 features, of which 752 are one-hot vectors representing a team. The coefficient of determination (R2) is used to score the predictions. It is displayed in equation 3, where n is the number of instances, y is the vector of observed values and f is the vector of predicted values.

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad SS_{tot} = \sum_i (y_i - \bar{y})^2, \qquad SS_{res} = \sum_i (y_i - f_i)^2, \qquad R^2 = 1 - \frac{SS_{res}}{SS_{tot}} \tag{3}$$
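Equation 3 can be written out as a short sketch. The function name and the toy values are hypothetical; the example only illustrates why a model that always predicts the mean of the observed values scores an R2 of 0, and why a negative R2 means a prediction worse than that mean.

```python
def r_squared(observed, predicted):
    """Coefficient of determination as defined in equation 3."""
    n = len(observed)
    mean_y = sum(observed) / n
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    return 1 - ss_res / ss_tot


# Hypothetical observed gradients.
observed = [0.1, -0.2, 0.3, 0.0]

# Always predicting the mean of the observations gives R2 = 0.
mean_prediction = [sum(observed) / len(observed)] * len(observed)
r2_mean = r_squared(observed, mean_prediction)

# A perfect prediction gives R2 = 1.
r2_perfect = r_squared(observed, observed)

# A prediction that is worse than the mean gives a negative R2.
r2_bad = r_squared(observed, [-0.1, 0.2, -0.3, 0.0])
print(r2_mean, r2_perfect, r2_bad)
```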

10-fold cross-validation is applied to guard against over-fitting: in each fold the randomly sampled training data accounts for 70% of the data and the test data for the remaining 30%. New samples are generated for each fold in the cross-validation.
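The one-hot transformation and the repeated 70/30 sampling described in this section can be sketched as below. The helper names are hypothetical, and reading the validation scheme as ten independent random 70/30 splits is an assumption based on the description above, not a statement about the exact implementation used.

```python
import random


def one_hot(values):
    """Map a list of class labels to one binary column per distinct class."""
    classes = sorted(set(values))
    rows = [[1 if value == c else 0 for c in classes] for value in values]
    return classes, rows


def random_splits(n_instances, n_repeats=10, train_fraction=0.7, seed=0):
    """Yield (train_indices, test_indices) for repeated random splits."""
    rng = random.Random(seed)
    n_train = round(n_instances * train_fraction)
    for _ in range(n_repeats):
        indices = list(range(n_instances))
        rng.shuffle(indices)  # a new random sample for every repetition
        yield indices[:n_train], indices[n_train:]


# Hypothetical boat sizes of ten races.
boat_sizes = [1, 8, 2, 4, 1, 2, 8, 4, 1, 2]
classes, encoded = one_hot(boat_sizes)
splits = list(random_splits(len(encoded)))
print(classes, encoded[1], len(splits))
```

Each pair of index lists would be used to fit a model on the training rows and to compute R2 on the test rows, after which the ten scores are averaged.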

6.4.1 Regression methods description and parameter determination

Below follow four paragraphs describing the Machine Learning techniques used for the prediction of the strategies and an explanation of the configuration choices.


(a) 50-500m (b) 500-1000m

(c) 1000-1500m (d) 1500-2000m

Figure 15: Effect number of neighbors on R2 score

K-Nearest Neighbor Regression K-Nearest Neighbor Regression is a method that compares the features of the new input (Xnew) to the features of the K closest neighbors in Euclidean space, calculated with the Euclidean distance as described in equation 4, where p and q are two different individual boat races.

$$d(p, q) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2} \tag{4}$$

These neighbors consist of instances used for training (Xtrain) and have corresponding labels (ytrain), which are used to calculate the label for Xnew (ynew) by averaging the labels of the K nearest instances. This method is suitable for the current problem, for it does not weigh the binary data; it only tries to minimize the number of places where there is a non-matching binary state. To see which number of neighbors produces the best result, some configurations are tried in a grid search, which is shown in Fig. 15. In all cases, the R2 score is stable when more than 10 neighbors are involved, which leads to the use of 10 nearest neighbors for all four gradient predictions. Usually the R2 declines again after a certain number of neighbors, but this behaviour is not detected in these graphs. This is a sign that the features are not informative enough for the model to do better than predicting the average. Also, because the data is high dimensional and sparse, the difference between two neighbors is small, which causes the algorithm to perform worse.
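The prediction step described above can be illustrated with a minimal sketch on one-hot encoded boat sizes. The function names and the toy data are hypothetical; the sketch uses equation 4 for the distance and averages the labels of the K nearest training instances.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two feature vectors (equation 4)."""
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))


def knn_predict(x_train, y_train, x_new, k):
    """Average the labels of the k training instances closest to x_new."""
    distances = sorted(
        (euclidean(x, x_new), y) for x, y in zip(x_train, y_train)
    )
    nearest = [y for _, y in distances[:k]]
    return sum(nearest) / len(nearest)


# Hypothetical one-hot boat sizes (1, 2, 4, 8) and first-500m gradients.
x_train = [[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 1]]
y_train = [-0.30, -0.26, -0.14, -0.16]

prediction = knn_predict(x_train, y_train, [1, 0, 0, 0], k=2)
print(prediction)
```

For binary vectors the squared Euclidean distance equals the number of non-matching positions, so two races with a different boat size are at distance sqrt(2) in this toy encoding.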


(a) 50-500m (b) 500-1000m

(c) 1000-1500m (d) 1500-2000m

Figure 16: Effect number of trees on R2 score

Random Forest Regression Random Forest Regression consists of the generation of a number of decision trees, all based on different bootstrap samples of the data. The splits on the nodes in the decision tree are generated on random samples of the features, choosing the split with the highest gain [20]. This highest gain is calculated on the data that is not in the bootstrap sample [20] using the Mean Squared Error, seen in equation 5, where f_i is the i-th predicted value and y_i is the corresponding observed value.

$$MSE = \frac{1}{n}\sum_{i=1}^{n} (f_i - y_i)^2 \tag{5}$$

Random Forest is a model that is suitable for many kinds of data, such as the binary data in this case, for it is able to generate a split on the binary value of a feature. It is not prone to overfitting (fitting the training data so closely that the model does not generalize well to the test data) and is not sensitive to outliers. Another benefit is the ability to show how much it uses each feature to generate a prediction, which gives more insight into the importance of features for strategy determination and makes the method less of a black box. The Random Forest Regression implementation from the Python package sklearn is used [19], and to allow the model to be of the highest complexity possible, only the number of trees is adjusted. For each of the four gradients, an exhaustive grid search is done to determine the most suitable number of trees. The number of trees after which the increase in value is no longer significant will be used in the prediction. Fig. 16 shows a drastic increase in performance between 20 and 40 trees for the prediction of the first 500m, and fluctuations around 0.26 for a larger number of trees. The same goes for the middle two parts of the race, since the increase after 40 trees is on average 0.005, which is an insignificant difference. Since the system becomes slower with every additional tree, the most profit is gained from using 40 random trees in the prediction of the first 3 parts. The performance in the fourth part of the race fluctuates greatly, having peaks at 30 and 75 random trees. Since the peak at 75 differs only 0.008 from the one at 30, while more than doubling the number of trees, 30 trees are used in the model for the fourth part of the race.
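How a single tree node can choose a split on a binary feature using the Mean Squared Error of equation 5 is sketched below. This is a simplified illustration with hypothetical names and data: it evaluates one node only, whereas a full Random Forest additionally draws bootstrap samples and random feature subsets and grows many trees.

```python
def mse(values):
    """Mean squared error of predicting the mean of the given values."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def split_gain(rows, targets, feature_index):
    """Reduction in MSE obtained by splitting on one binary feature."""
    left = [y for row, y in zip(rows, targets) if row[feature_index] == 0]
    right = [y for row, y in zip(rows, targets) if row[feature_index] == 1]
    weighted = (len(left) * mse(left) + len(right) * mse(right)) / len(targets)
    return mse(targets) - weighted


def best_split(rows, targets, candidate_features):
    """Return the candidate feature index with the highest gain."""
    return max(candidate_features, key=lambda j: split_gain(rows, targets, j))


# Hypothetical columns: [is one-manned boat, is final]; targets are gradients.
rows = [[1, 0], [1, 1], [0, 0], [0, 1]]
targets = [-0.30, -0.28, -0.10, -0.12]

chosen = best_split(rows, targets, [0, 1])
print(chosen, split_gain(rows, targets, 0), split_gain(rows, targets, 1))
```

In this toy data the split on the one-manned-boat column removes almost all variance, which mirrors the kind of feature importance a Random Forest can report.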


Multi-Layer Perceptron Regression Multi-Layer Perceptron Regression is a neural network based regression method. A Multi-Layer Perceptron consists of several layers of neurons; a simple architecture can be seen in Fig. 17.

Figure 17: A Multi-Layer Perceptron architecture [18]

The values of the features form the input and are transformed by the network to an output that minimizes the cost function. To determine this transformation, the model uses forward and backward propagation. Forward propagation is the path from input layer to output layer, where each input of a neuron gets a weight assigned and the weighted input is transformed by a non-linear activation function into the output of the neuron. This output is the input for the next layer, and this process repeats itself for every layer until the output layer is reached. When the output is reached, the Squared Loss is calculated, generating feedback on the adequateness of the fit. Back propagation is this process in reverse, from the output layer to the input layer, and instead of an activation function a propagation function transforms the output of a neuron. The process is repeated until the Mean Squared Error no longer changes, at which point the best model found for the data is kept. The combination of both processes allows the network to learn. In this study the Multi-Layer Perceptron Regression implementation of sklearn is used [17], which has a few parameters to optimize. The first is the optimizer, the second is the activation function, the third is the number of layers, and the fourth is the number of nodes per layer. Since the goal is an optimal prediction of the strategy, a grid search is done on several configurations of these parameters, where a grid search is an exhaustive search in which a number of values are tried for each parameter. The number of layers is either 1, 2 or 3, to see if it makes sense to increase the number of layers. The number of nodes per layer is in the range of 25-150, again to see whether increasing the number of neurons means a gain in performance. The neural network is chosen for its ability to adapt, but the downside is that it functions as a black box: it is hard to find out which features are more important for the prediction and which are not.
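The exhaustive character of a grid search can be shown with a small sketch. The `grid_search` helper and the `toy_score` function are hypothetical: in the actual experiments the score would be the cross-validated R2 of a fitted network, while here a made-up function stands in for it. The layer and neuron values are the ones listed in tables 13 to 16.

```python
import itertools


def grid_search(param_grid, score):
    """Try every combination of parameter values and keep the best one."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for combo in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


param_grid = {
    "n_layers": [1, 2, 3],
    "n_neurons": [25, 50, 75, 100, 125, 150],
}


def toy_score(params):
    # Hypothetical stand-in for a cross-validated R2; not a real result.
    penalty_neurons = abs(params["n_neurons"] - 125) / 100
    penalty_layers = 0.1 * (params["n_layers"] - 1)
    return -penalty_neurons - penalty_layers


best_params, best_score = grid_search(param_grid, toy_score)
n_configurations = len(list(itertools.product(*param_grid.values())))
print(best_params, n_configurations)
```

With three layer counts and six neuron counts the search evaluates 18 configurations per gradient, before the optimizer and the activation function are varied as well.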

Optimizer The optimizer used for updating the parameters can be Stochastic Gradient Descent, Adam or Limited-memory BFGS. These are all gradient-based methods for the optimization of the parameters. Stochastic Gradient Descent uses stochastically sampled mini-batches from the data and iteratively minimizes the objective function [2]. This approach works well with large datasets and a large number of hidden layers. The current data set is large in its number of features, which is 763 after binarization, which makes Stochastic Gradient Descent a suitable technique. The Adam optimizer is a more robust method than Stochastic Gradient Descent, also suitable for large datasets and high-dimensional parameter spaces [14]. It uses moving averages of the gradient and of the squared gradient, which enables the algorithm to adapt the learning rate efficiently by itself. The learning rate is the parameter that defines how fast the optimizer moves towards the optimal parameters. A learning rate that is too high enlarges the chance of skipping the optimal solution, and one that is too low causes a large number of iterations. Currently Adam is used more often than Stochastic Gradient Descent, and it is applicable to our dataset for the same reasons. The third optimizer is Limited-memory BFGS, a parameter optimization algorithm based on quasi-Newton methods that is especially suited to problems with a large number of variables [1]. It often gets closer to the optimal value than Stochastic Gradient Descent, but is more expensive to calculate, for it approximates second-order information whereas Stochastic Gradient Descent and Adam are first-order methods. Since the aim is to make the best prediction possible, this optimizer is tested as well. The speed of the prediction is not a discriminating factor in this thesis, since the prediction only has to be done once.


Activation function The activation function is either the identity function (equation 6), the logistic function (equation 7, also called the sigmoid function), the tanh function (equation 8) or the ReLU function (equation 9), where x is the weighted input of a neuron. The identity function applies no non-linearity, so a network using it can only represent linear relations, and it is therefore not considered suitable for this problem. The logistic, tanh and ReLU functions are used if the transformation from input to output needs non-linear capabilities. Tanh often converges (reaches an optimum) faster than the standard logistic function, but since time is not a constraint, both are tried. The ReLU function is often applied to networks with many hidden layers, but is also functional for networks with fewer layers and will therefore also be part of the parameter tests.

$$f(x) = x \tag{6}$$

$$f(x) = \frac{1}{1 + e^{-x}} \tag{7}$$

$$f(x) = \frac{2}{1 + e^{-2x}} - 1 \tag{8}$$

$$f(x) = \max(x, 0) \tag{9}$$
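Equations 6 to 9 translate directly into code. The sketch below uses hypothetical function names and only serves to show the range of each function; it also checks that equation 8 is the hyperbolic tangent.

```python
import math


def identity(x):
    """Equation 6: the output equals the weighted input."""
    return x


def logistic(x):
    """Equation 7: squashes the input to the interval (0, 1)."""
    return 1 / (1 + math.exp(-x))


def tanh(x):
    """Equation 8: squashes the input to the interval (-1, 1)."""
    return 2 / (1 + math.exp(-2 * x)) - 1


def relu(x):
    """Equation 9: passes positive inputs and sets negative inputs to 0."""
    return max(x, 0)


for value in (-2.0, 0.0, 2.0):
    print(value, identity(value), logistic(value), tanh(value), relu(value))
```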

Optimizing Multi-Layer Perceptron parameters When comparing the optimizers (table 11), Adam performs best in the prediction of both the first 500m and the last 500m, while Limited-memory BFGS outperforms Adam in both middle parts, for it leads to a prediction that is close to the mean of the data, which corresponds to an R2 score of 0. Based on these results Adam will be used for the first and final 500m and Limited-memory BFGS for both middle parts.

R2 per predicted gradient

| Optimizer | 50-500m | 500-1000m | 1000-1500m | 1500-2000m |
| --- | --- | --- | --- | --- |
| LMBFGS | 0.110 | -0.0970 | -0.094 | -0.259 |
| Adam | 0.293 | -0.264 | -0.267 | -0.007 |
| SGD | -0.346 | -3.165 | -2.186 | -0.483 |

Table 11: Effect of the optimizer on the R2 of the prediction of a gradient in a configuration of one layer with 100 neurons using the ReLu function

| Activation function | 50-500m | 500-1000m | 1000-1500m | 1500-2000m |
| --- | --- | --- | --- | --- |
| Logistic | 0.350 | 0.025 | 0.014 | 0.124 |
| tanh | 0.324 | -0.139 | -0.215 | -0.012 |
| ReLu | 0.293 | -0.264 | -0.267 | -0.015 |

Table 12: Effect of the activation function on the results in a configuration of one layer with 100 neurons using the Adam optimizer

In the case of the activation function the results on all four parts are unambiguous, as can be seen in table 12. The logistic function outperforms all other functions, and in three cases the R2 goes from negative to positive, which is equal to going from a prediction that is worse than predicting only the mean to one that is slightly better. Several configurations of neurons per layer and number of layers have been tried using the Adam optimizer and the logistic activation function, and their results are shown in tables 13, 14, 15 and 16. When predicting the first part of the race, the best results are generated with 1 layer containing 125 neurons, and this configuration will be used in the prediction. Beyond 125 neurons the performance does not increase further. The second gradient gets the best results when either one layer of 50 neurons or 2 layers of 100 neurons are used. Since the first is less complex, this is the one that will be used in the prediction of the second gradient. The third gradient has the highest performance with two layers and 150 neurons per layer. Adding one more layer does not add to the performance. The fourth gradient is best predicted with a one-layer architecture with either 50 or 75 neurons. As the first is less expensive, this one is used.


| # Layers \ # Neurons per layer | 25 | 50 | 75 | 100 | 125 | 150 |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 0.249 | 0.272 | 0.295 | 0.309 | 0.312 | 0.308 |
| 2 | 0.060 | 0.145 | 0.166 | 0.171 | 0.234 | 0.250 |
| 3 | 0.006 | 0.009 | 0.002 | 0.010 | 0.012 | 0.00118 |

Table 13: Effect of the number of layers and the number of neurons per layer on the R2 of the 50-500m prediction using the Adam optimizer and the logistic activation function.

| # Layers \ # Neurons per layer | 25 | 50 | 75 | 100 | 125 | 150 |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | -0.019 | 0.030 | 0.010 | 0.025 | 0.015 | 0.003 |
| 2 | 0.017 | 0.023 | 0.021 | 0.030 | 0.025 | 0.027 |
| 3 | -0.001 | -0.004 | -0.001 | -0.001 | -0.008 | -0.006 |

Table 14: Effect of the number of layers and the number of neurons per layer on the R2 of the 500-1000m prediction using the Adam optimizer and the logistic activation function.

| # Layers \ # Neurons per layer | 25 | 50 | 75 | 100 | 125 | 150 |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 0.017 | 0.038 | 0.036 | 0.041 | 0.016 | <0.001 |
| 2 | 0.028 | 0.037 | 0.040 | 0.030 | 0.045 | 0.048 |
| 3 | 0.002 | 0.002 | -0.002 | -0.004 | -0.006 | -0.005 |

Table 15: Effect of the number of layers and the number of neurons per layer on the R2 of the 1000-1500m prediction using the Adam optimizer and the logistic activation function.

| # Layers \ # Neurons per layer | 25 | 50 | 75 | 100 | 125 | 150 |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 0.115 | 0.124 | 0.124 | 0.114 | 0.107 | 0.112 |
| 2 | 0.048 | 0.064 | 0.070 | 0.072 | 0.107 | 0.078 |
| 3 | 0.002 | <0.001 | 0.004 | 0.009 | -0.003 | 0.004 |

Table 16: Effect of the number of layers and the number of neurons per layer on the R2 of the 1500-2000m prediction using the Adam optimizer and the logistic activation function.

6.4.2 Results strategy prediction

| Regression technique | 50-500m | 500-1000m | 1000-1500m | 1500-2000m |
| --- | --- | --- | --- | --- |
| K-Nearest Neighbor | 0.096 | -0.018 | -0.102 | 0.058 |
| Random Forest | 0.249 | -0.037 | -0.153 | -0.024 |
| Multi-Layer Perceptron | 0.347 | -0.016 | -0.002 | 0.121 |

Table 17: Strategy prediction results

When comparing the scores of the models for the four gradients, which is done in table 17, it becomes clear that the performance of the models is best for the gradients at 50-500m and 1500-2000m. This is not surprising, as these parts show the most significant differences in the feature comparisons in section 6.3.


The first 500m shows the highest R2 of 0.347 when the Multi-Layer Perceptron is used for prediction, which means that about 35% of the variation in the gradient is associated with the boat size, round, team and number of races after the current race on the same day. Therefore these features seem to be considered by rowers when deciding on the strategy for the first 500m. Though it is not clear which of these features have the highest predictive value for the Multi-Layer Perceptron, Random Forest Regression shows that it gets the most predictive value from the binary feature of whether a boat is a one-manned boat or not. Compared to their average stroke rate, one-manned boats tend to have a significantly higher stroke rate at the beginning than other boat sizes, which explains the predictive value. This does not guarantee that this is also the most important feature for the Multi-Layer Perceptron; it only shows what the performance of the Random Forest Regressor is based on. The final 500m is the second best predicted gradient, with the Multi-Layer Perceptron explaining about 12% of the variance of the gradients. The most important feature in Random Forest Regression is whether the boat rows a final or not, and Fig. 7 in section 6.3.1 and Fig. 41 and Fig. 46 in appendix G show a steeper incline in stroke rate in the finals than in all other rounds. Again, this does not guarantee that the Multi-Layer Perceptron based its model on the same features as the Random Forest Regressor.

The gradients describing the middle part of the race are the least predictable. This is probably because the standard deviation of the middle two gradients is smaller than that of the outer two gradients, which is shown in Fig. 4 in section 2. The results show that the deviations from this mean are not based on the features that are used for prediction, meaning that they are caused by features that are not known before the start of the race. Examples are the position of the other rowers in the field, the weather and psychologically based decisions (like giving up because a crew loses the hope of winning). The same goes for the first and final 500m: much of the outcome is unexplained and therefore caused by features that are not considered in this research. Since the features have been shown to influence the strategy used in the first and last 500m of the race, separate models will be generated for the prediction of the outcome of the race. Keeping these features constant within a model can increase the performance, because the variance caused by differences in boat size, round or number of races after the current race is filtered out.

7 Relation between strategy and final rank

7.1 Method

To answer the question whether some strategies are more likely to generate a win than others, and to identify these strategies, an analysis is done on the relation between strategies and ranks. The analysis looks solely at ranks and not at times, because environmental factors influence the time, creating an unfair difference between two separately rowed races. These factors consist of weather, current and water temperature, which can differ significantly between days and even throughout one day, and unfortunately are not represented in the data. Because the rank is a discrete representation of time within one race (the fastest crew gets rank 1 and the slowest gets rank 6), it defines the performance of a team best, without being influenced by environmental factors. After all, rank 1 remains rank 1 even if the fastest time was slower because of an unfortunate wind direction. Also, it is more important to get the first rank than how large the difference in time is to the crew finishing second.

First, the relation between strategy and rank is determined using the correlation and t-test between the separate gradients of the strategies belonging to different ranks. This will answer the question whether the separate gradients are related to the rank and which gradients are the most influential. Afterwards, regression methods are applied to predict the rank based on the combination of the four gradients. This will show the influence that the strategy as defined in this thesis has on the rank. Because there are so many differences between rounds, boat sizes and number of races after the current race, the ranks will be analyzed both independently of and dependent on these features. The general approach ignores the race specific features and checks whether there are strategies that lead to one rank regardless of the race type. The dependent approach checks whether the strategies differ more distinctly between ranks when the influencing features are kept constant. Several different feature configurations are used for this; an example of such a configuration is boat size 1, round Heats and 1 race after the current race. The expectation is that when the strategies are grouped per rank and analyzed independently of these features, there is a larger spread in the strategies that are used than when they are analyzed dependent on these features. Though team is also a feature with a large influence, it is not considered as a dependent feature here, because it would leave group sizes that are too small to base conclusions on.
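The comparison of one gradient between two ranks can be sketched as follows. This is a minimal illustration and not the implementation used in this thesis: the text does not state which variant of the t-test is applied, so Welch's unequal-variance version is assumed here, and the gradient values are made up for the example. Only the test statistic and the approximate degrees of freedom are computed; a p-value would additionally require the t-distribution.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance uses n - 1).
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical normalized 1000-1500m gradients, made up for illustration only.
rank_1 = [0.02, 0.01, 0.03, 0.00, 0.02, 0.01]
rank_6 = [-0.03, -0.01, -0.04, -0.02, 0.00, -0.03]

t_stat, dof = welch_t(rank_1, rank_6)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A large absolute value of the statistic indicates that the mean gradient differs between the two ranks by more than the spread within the ranks would suggest.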

7.2 Correlation

The correlations are calculated to see whether there is a general relation between the rank, which is a discrete number between 1 and 6 with an ordinal relation, and the four strategy gradients, which are continuous values. Again, the Spearman correlation is the most suitable correlation method, because it can handle the discrete classes of the ranking.
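As an illustration of why the Spearman correlation suits this data, the sketch below computes it as the Pearson correlation of tie-averaged ranks, which handles the many ties that occur when one variable only takes the values 1 to 6. The data and variable names are hypothetical, and the code uses only the standard library for transparency; in practice a library routine such as `scipy.stats.spearmanr` would normally be used.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the block
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: final rank (1-6) and one normalized gradient per crew.
final_rank = [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]
gradient = [0.03, 0.02, 0.02, 0.00, -0.01, -0.02, 0.01, 0.03, 0.00, -0.01, 0.00, -0.03]

print(round(spearman(final_rank, gradient), 3))
```

A negative value means that a lower gradient tends to go together with a higher rank number.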

7.2.1 Results independent from rounds, boat sizes and # of races after on the same day

Gradient      Correlation with rank   p-value
50-500m        0.036                   0.008
500-1000m     -0.050                  <0.001
1000-1500m    -0.136                  <0.001
1500-2000m    -0.044                   0.001

Table 18: Correlation between rank and gradient

The results in table 18 show that the correlations are small; the only gradient with an absolute correlation higher than 0.1 is the 1000-1500m part. This low correlation suggests that the relation between strategy and rank is weak and that predicting the rank using the strategy probably will not lead to good results. Nevertheless, it is interesting to see what differences can be found when comparing the ranks, especially in the 1000-1500m gradient, and to see how high the predictive value of a strategy is.

7.2.2 Results dependent on rounds, boat sizes and # of races after on the same day

To see whether the correlation between the strategy and the rank increases when looking per feature configuration, the Spearman correlation is applied to the groups of races belonging to different configurations of round, boat size and number of races after the current race on the same day. Only the configurations with a sample size larger than 30 are used. A smaller sample would contain fewer than 5 observations of each rank, which is too few to base a conclusion on. The table below shows the results per feature configuration; it only contains the feature configurations where significant results are found. More information on the rest of the configurations is in appendix H. The highest correlation is -0.505, in the 1000-1500m part of the configuration of a four-person boat in the heats with one race after the current race (4H1). This is a moderate negative relation: crews that decrease their stroke rate in this part of the race tend to end at a higher rank number. This is in line with the results found in 6.3.5. The second highest correlation is -0.296, in the 1500-2000m part of the 1H0 configuration, a weaker relation of the same kind: when the gradient decreases, the rank tends to increase. This means that a crew that increases its stroke rate at the end of the race has a higher chance of reaching a lower rank number, i.e. a better result. Each of the configurations has a significant result in either the 1000-1500m part or the 1500-2000m part, making these the most informative parts on which rank is reached. 1H0 is expected to be the configuration in which the rank is best predictable, for it contains the largest number of significant values. Most of the configurations have at least one correlation that is larger in absolute value than -0.136, which is the highest correlation in the general approach. Therefore it is expected that these separate configurations show more distinction in strategy between the ranks.
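The selection of feature configurations described above can be sketched as follows. This is a hedged illustration rather than the code used for the thesis: the record fields, the helper names and the mapping from round names to the letters H, R, S and F are assumptions inferred from labels such as 1H0 and 4H1, and the records are generated for the example.

```python
from collections import defaultdict

# Assumed one-letter round codes, inferred from labels such as "1H0".
ROUND_CODE = {"Heats": "H", "Repechage": "R", "Semifinal": "S", "Final": "F"}
MIN_SAMPLES = 30  # only configurations with more than 30 samples are analysed


def config_label(boat_size, round_name, races_after):
    """Build a label such as '4H1' from the three race features."""
    return f"{boat_size}{ROUND_CODE[round_name]}{races_after}"


def group_by_configuration(records, min_samples=MIN_SAMPLES):
    """Group records per configuration and drop groups that are too small."""
    groups = defaultdict(list)
    for rec in records:
        label = config_label(rec["boat_size"], rec["round"], rec["races_after"])
        groups[label].append(rec)
    return {label: recs for label, recs in groups.items() if len(recs) > min_samples}


# Hypothetical records; the field names are illustrative, not the thesis data.
records = (
    [{"boat_size": 1, "round": "Heats", "races_after": 0, "rank": i % 6 + 1} for i in range(36)]
    + [{"boat_size": 4, "round": "Heats", "races_after": 1, "rank": i % 6 + 1} for i in range(12)]
)

kept = group_by_configuration(records)
print({label: len(recs) for label, recs in kept.items()})  # {'1H0': 36}
```

In this example the 4H1 group is dropped because it holds only 12 samples; the correlation per gradient would then be computed separately within each remaining group.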


Feature       Correlation to Rank   p-value

(a) 1H0
50-500m        0.044                 0.437
500-1000m     -0.175                 0.002
1000-1500m    -0.111                 0.048
1500-2000m    -0.296                <0.001

(b) 1S0
50-500m        0.025                 0.647
500-1000m     -0.070                 0.210
1000-1500m    -0.233                <0.001
1500-2000m    -0.113                 0.042

(c) 1F0
50-500m        0.074                 0.112
500-1000m     -0.045                 0.330
1000-1500m    -0.118                 0.011
1500-2000m    -0.085                 0.067

(d) 2H0
50-500m        0.030                 0.471
500-1000m     -0.029                 0.472
1000-1500m    -0.159                <0.001
1500-2000m     0.017                 0.679

(e) 2R0
50-500m        0.018                 0.698
500-1000m      0.051                 0.272
1000-1500m    -0.101                 0.030
1500-2000m    -0.016                 0.737

(f) 2S0
50-500m        0.040                 0.333
500-1000m      0.025                 0.545
1000-1500m    -0.272                <0.001
1500-2000m    -0.197                <0.001

(g) 2F0
50-500m        0.048                 0.231
500-1000m     -0.026                 0.511
1000-1500m    -0.047                 0.237
1500-2000m    -0.110                 0.006

(h) 2H1
50-500m       -0.125                 0.064
500-1000m     -0.216                 0.001
1000-1500m    -0.199                 0.003
1500-2000m    -0.085                 0.209

(i) 4H0
50-500m        0.087                 0.118
500-1000m     -0.028                 0.622
1000-1500m    -0.204                <0.001
1500-2000m    -0.204                <0.001

(j) 4R0
50-500m        0.020                 0.764
500-1000m     -0.075                 0.257
1000-1500m    -0.077                 0.245
1500-2000m    -0.162                 0.013

(k) 4S0
50-500m        0.080                 0.311
500-1000m      0.050                 0.540
1000-1500m    -0.073                 0.353
1500-2000m    -0.257                <0.001

(l) 4F0
50-500m        0.111                 0.086
500-1000m     -0.065                 0.319
1000-1500m    -0.026                 0.687
1500-2000m    -0.232                <0.001

(m) 4H1
50-500m       -0.318                 0.025
500-1000m     -0.089                 0.540
1000-1500m    -0.505                <0.001
1500-2000m    -0.200                 0.163
