Scientific article - Machine learning methods for predicting VO2max from sub-m

Arne De Brabandere¹, Tim Op De Beéck¹, Kurt Schütte², Wannes Meert¹, Benedicte Vanwanseele², Jesse Davis¹

¹ Department of Computer Science, KU Leuven, Leuven, Belgium
² Department of Kinesiology, KU Leuven, Leuven, Belgium

Abstract

Maximal oxygen uptake, or VO2max, is often used to assess an individual’s aerobic fitness level. However, measuring this variable requires a maximal exercise test, which cannot be performed regularly by athletes. The aim of this study is to develop a new model for predicting VO2max by using the relation between heart rate and accelerometer features extracted from submaximal running. 31 recreational runners (15 men and 16 women) aged 19-26 years performed a maximal incremental test on a treadmill. During this test, heart rate and acceleration at three locations (upper back, lower back and left or right tibia) were measured. Various features were extracted from the measurements of the warm-up stage and the first three stages. We selected a small subset of these features using a data-driven approach. Combining heart rate and accelerometer features resulted in a model with an explained variance (R²) of 0.784. The model requires four features: gender, body weight, reciprocal of average heart rate and reciprocal of standard deviation of total tibia acceleration during the warm-up stage of the treadmill test. The prediction model can be used as a practical tool to predict VO2max from two body-worn sensors (a heart rate monitor and an accelerometer) during submaximal running on a treadmill.

Introduction

In endurance sports such as distance running and cycling, coaches and sport scientists have a strong interest in tracking the fitness level of athletes for both inter- and intra-athlete comparison. Cardiorespiratory fitness is often measured in terms of maximal oxygen uptake (VO2max). This variable is defined as the maximal rate at which an individual can consume oxygen during exercise and is expressed in ml · kg⁻¹ · min⁻¹. VO2max is one of the important determinants of endurance performance. Although endurance performance is also determined by the fractional utilization of VO2max (lactate threshold) and economy of movement [1], maximal oxygen uptake sets the upper limit for it [2].

Maximal oxygen uptake can be measured during running by performing a maximal incremental test on a treadmill. However, protocols for measuring VO2max are physically demanding. Moreover, it is not always possible to incorporate maximal running tests into the strict training regimens of endurance athletes. Besides these physical limitations, maximal running tests also have several practical limitations. Performing such tests requires a lab setup with specialized equipment operated by trained staff. Therefore VO2max testing is often too expensive for non-elite runners. Another important limitation is that maximal exercise intensity is reached in the tests. This entails a risk for older individuals and people with a heart condition.

Because of these limitations various models have been proposed for predicting maximal oxygen uptake from submaximal exercise. An extensive overview is given by Abut et al. [3]. We present a summary of the different types of variables used in existing models and point out possible improvements.

Descriptive variables are used in all of the models presented in the overview of Abut et al. Gender and age are the most common variables, usually in combination with body weight, height, BMI or a combination of those. These variables are necessary for predicting VO2max and will therefore be used in this study as well.

Subjective variables are often added to the descriptive variables in models based on submaximal running. Some models make use of a self-selected running or walking speed and measure heart rate as well as time or distance after exercising for a set distance or time at that speed [4–6], or use the self-selected speed directly [7]. Other models are based on subjective questionnaire variables such as perceived functional ability [8–11].

The disadvantage of these variables is that choices of the individual can negatively influence the results. Therefore we argue that only objective variables should be used for the prediction of maximal oxygen uptake.

Objective variables are used in related work by combining heart rate and variables derived from accelerometers. Weyand et al. [12] found that the relation between foot-ground contact time and heart rate during running predicts maximal oxygen uptake. They proposed a linear regression model using two features: gender and ‘aerobic fitness index’ (AFI), defined as (tc · HR)⁻¹, where tc and HR are respectively contact time and heart rate averaged over a few minutes of running at an arbitrary speed. Tönis et al. [13] developed a model using the relation between heart rate and ‘level of activity’ (defined as the sum of the integrals of the absolute value of acceleration for the three accelerometer axes) during walking at two different velocities.
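As an illustration of these two pre-defined variables, the following minimal sketch computes them from already extracted values. The function names, the units (seconds for contact time, beats per minute for heart rate) and the rectangle-rule approximation of the integral are assumptions made here for illustration; they are not taken from Weyand et al. or Tönis et al.

```python
def aerobic_fitness_index(contact_time_s, heart_rate_bpm):
    """AFI = (tc * HR)^-1 for averaged contact time and heart rate.

    Units (seconds, beats per minute) are an assumption of this sketch.
    """
    return 1.0 / (contact_time_s * heart_rate_bpm)


def level_of_activity(ax, ay, az, dt):
    """Sum over the three axes of the integral of |acceleration|.

    The integral is approximated with the rectangle rule using a fixed
    sampling interval dt (an assumption of this sketch).
    """
    return sum(sum(abs(v) for v in axis) * dt for axis in (ax, ay, az))
```

Both functions return a single number per recording, which is what distinguishes these approaches from a method that selects among many candidate features.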

The ideas in these studies for combining heart rate and accelerometer data can be extended by using more accelerometer features. Apart from contact time and level of activity, many other features can be calculated from accelerometer data. The goal is then to find a set of features that are relevant for predicting maximal oxygen uptake.

In this study we hypothesize that VO2max can be predicted from submaximal running based on the combination of heart rate and accelerometer features selected with a data-driven approach. Unlike the models of Weyand et al. and Tönis et al., this approach can extract more information from accelerometer measurements since it does not use a single pre-defined variable (contact time or level of activity) but uses an automatic feature selection method to find other relevant variables.

To test this hypothesis, different features are calculated from heart rate and accelerometer measurements collected during the first stages of an incremental treadmill test. We construct prediction models based on different types of features to evaluate how the results improve when combining heart rate features and accelerometer features compared to using these features separately. To construct these models, we propose an automatic feature selection method to find relevant features.

The main contributions of this paper are summarized as follows:

1. we present a data-driven approach to automatically extract relevant features for the prediction of VO2max during submaximal running;

2. we show that combining heart rate and accelerometer features with this approach results in an accurate prediction model.

Subjects and protocol

31 recreational runners including 15 men and 16 women aged 19-26 years volunteered to participate in this study. Only subjects who had been running regularly and had prior experience with treadmill running were eligible to be included in the study. All subjects were screened to have no known history of metabolic, neurological or cardiovascular disease, or surgery to the back or lower limbs, and were symptom-free of any lower extremity injury for at least six months prior to the study. All runners provided written informed consent prior to participation in accordance with the Declaration of Helsinki.

The local ethics committee of Stellenbosch University approved the study (#SU-HSD-002032).

Each subject performed one or two maximal incremental running tests to exhaustion on a motorized treadmill (Saturn h/p/cosmos, Nussdorf-Traunstein, Germany) at 1% slope. The test started at a running speed of 8 km · hr⁻¹ for women or 9 km · hr⁻¹ for men. A 4-minute warm-up at the starting speed was provided first, after which treadmill speed was increased discontinuously in increments of 1.5 km · hr⁻¹ every four minutes, interspersed with a one-minute rest, until volitional exhaustion. An example of the protocol is shown in Fig 1. Participants could run in their own relatively new (within three months of use) conventional running shoes. The treadmill gradient was fixed at 1% throughout the submaximal assessments to reflect the energetic cost of outdoor running [14]. All tests were performed under similar laboratory conditions (20-25°C, 50-60% relative humidity, 130 m altitude). Each subject reported a rating of perceived exertion score [15] immediately after each stage.

Pulmonary gas exchange was recorded throughout the incremental test using a breath-by-breath metabolic analyzer (Cosmed Quark CPET, Rome, Italy). The gas analyzers were calibrated before each session using 16% O2, 4% CO2, balance N2, and the turbine flow meter was calibrated with a 3 L calibration syringe before each test. Heart rate (HR) was recorded by a heart rate monitor (Cosmed). Acceleration was measured using wearable inertial measurement units (Shimmer3 wireless IMU, sampling rate 1024 Hz, range ±16 g, Dublin, Ireland) at four locations: upper back, lower back, and left and right tibia. The upper back accelerometer was aligned between the shoulder blades at the level of the C7-T2 spinal processes. The lower back accelerometer was aligned between the posterior superior iliac spines at the level of the L3-L5 spinal processes, and the tibial accelerometers were aligned on the antero-medial aspect of the distal tibia, 8 cm above the medial malleolus. Participants were fitted with an adjustable safety harness during the entire treadmill test. Runners were considered to have achieved VO2max when at least two of the following criteria were fulfilled:

1. a plateau in the VO2, defined as an increase of less than 1.5 ml · kg⁻¹ · min⁻¹ in two consecutive stages;

2. a respiratory quotient (R-value) > 1.15;

3. a maximal heart rate value (HRmax) > 95% of the age-predicted maximum (220 − age);

4. a rating of perceived exertion (RPE) ≥ 19 on the 6-20 Borg scale.
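The four criteria above can be expressed as a simple check. This is a minimal sketch: the function name and its parameters are hypothetical, and it assumes the VO2 increase between the last two stages has already been computed.

```python
def reached_vo2max(vo2_increase, r_value, hr_max, age, rpe):
    """Return True when at least two of the four criteria are fulfilled.

    vo2_increase: VO2 increase between the last two consecutive stages
                  (ml/kg/min); r_value: respiratory quotient;
    hr_max: maximal heart rate (beats/min); age in years;
    rpe: rating of perceived exertion on the 6-20 Borg scale.
    """
    criteria = [
        vo2_increase < 1.5,            # 1. plateau in VO2
        r_value > 1.15,                # 2. respiratory quotient
        hr_max > 0.95 * (220 - age),   # 3. age-predicted maximal heart rate
        rpe >= 19,                     # 4. perceived exertion
    ]
    return sum(criteria) >= 2
```

For example, a 22-year-old runner with a VO2 increase of 0.8, an R-value of 1.10, a maximal heart rate of 195 and an RPE of 18 fulfils criteria 1 and 3 and would therefore be accepted.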

For the analysis six treadmill tests were excluded from the data set. The heart rate measurements of four tests showed an irregular pattern that was probably caused by a badly connected heart rate strap. In two other tests the accelerometer data was not recorded properly. The descriptive characteristics of the participants of the remaining treadmill tests are summarized in Table 1.

Fig 1. Example of the protocol for a male runner reaching stage 6. The figure shows, in order, the warm-up, stages 1 to 6 and the cool-down.

Table 1. Descriptive characteristics of the subjects. Notation: mean ± SD

| | Men | Women | All |
| Number of subjects | 12 | 16 | 28 |
| Number of tests | 16 | 25 | 41 |
| Age (years) | 22.06 ± 2.11 | 21.56 ± 0.70 | 21.76 ± 1.44 |

In this section we describe the data measured during the treadmill tests that will be used to calculate features for the prediction models.

Heart rate Heart rate measurements collected during the test were averaged every ten seconds. The averaged signal was smoothed using a median filter where the value of each 10-second window is set to be the median of the window itself, and the three windows before and after it.
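The smoothing step can be sketched as a median filter over seven consecutive 10-second windows. This is a minimal illustration: the function name is hypothetical, and shrinking the window at the start and end of the signal is an assumption, since the text does not say how the edges were handled.

```python
from statistics import median


def smooth_heart_rate(hr_10s, half_width=3):
    """Median-filter a heart rate signal that was averaged per 10 seconds.

    Each value is replaced by the median of itself and the `half_width`
    windows before and after it. Near the edges the window is truncated
    (an assumption of this sketch).
    """
    smoothed = []
    for i in range(len(hr_10s)):
        lo = max(0, i - half_width)
        hi = min(len(hr_10s), i + half_width + 1)
        smoothed.append(median(hr_10s[lo:hi]))
    return smoothed
```

A single outlier, such as one 10-second window at 200 beats per minute in an otherwise steady signal around 125, is removed by this filter.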

Acceleration The locations of the accelerometers attached to the runners’ bodies (upper back, lower back, and left and right tibia) are shown in Fig 2. In some tests, one of the tibia accelerometers fell off. Therefore only data from one tibia accelerometer is used, and the left one is selected by default if both were available. The accelerometer measurements were sampled at 1024 Hz. To remove noise, a low-pass filter with a cut-off frequency of 50 Hz was applied to the signals, which is high enough to capture characteristics of running patterns. To make sure that the axes of the accelerometers were rotated correctly, the Moe-Nilssen tilt correction method [16] was used to align the axes with the anterior-posterior, mediolateral, and vertical direction of the runners.

Oxygen uptake To measure VO2max, oxygen uptake (VO2) was measured continuously during the test. Maximal oxygen uptake was calculated as the maximum value of the floating average of the VO2 signal with a window length of 30 seconds.
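A minimal sketch of this calculation is given below. It assumes the breath-by-breath VO2 signal has first been resampled to a uniform rate (1 Hz by default); that resampling step and the function name are assumptions of this sketch and are not described in the text.

```python
def vo2max_from_signal(vo2, fs=1.0, window_s=30.0):
    """Maximum of the moving average of a uniformly sampled VO2 signal.

    vo2: VO2 samples in ml/kg/min; fs: sampling rate in Hz (assumed
    uniform); window_s: window length in seconds.
    """
    n = int(round(window_s * fs))
    if n < 1 or len(vo2) < n:
        raise ValueError("signal is shorter than one averaging window")
    window_sum = sum(vo2[:n])
    best = window_sum
    # Slide the window one sample at a time, updating the running sum.
    for i in range(n, len(vo2)):
        window_sum += vo2[i] - vo2[i - n]
        best = max(best, window_sum)
    return best / n
```

Averaging over 30 seconds keeps single high breaths from inflating the reported maximum.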

Feature engineering

Three types of submaximal features are used in this study: descriptive features, heart rate features and accelerometer features. Since the goal is to develop a model for predicting VO2max from submaximal exercise, the latter two types of features are extracted from the warm-up stage and the first three stages of the test. All subjects were able to reach the third stage, which was performed at 12 km · hr⁻¹ for men and 11 km · hr⁻¹ for women. The remainder of this section describes which features are extracted. Table 2 gives an overview of these features.

Descriptive features All models listed in the overview of Abut et al. [3] use (a subset of) gender, weight, height, BMI and age. This study considers two of these features: gender (G: 1 = male, 2 = female) and body weight (BW), which are known to be relevant for predicting VO2max. Because the age range of the subjects is small (19-26 years) and maximal oxygen uptake decreases by only approximately 0.2-0.5 ml · kg⁻¹ · min⁻¹ per year [17], age is not considered.

Heart rate features From the heart rate measurements we calculate average heart rate (HR) and the reciprocal of average heart rate (HR⁻¹). These features are computed for the warm-up stage and the first three stages of the test. Because heart rate dropped during the rest periods, the average is computed only over the last minute of each stage, where heart rate was more stable.
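As a sketch of this step, the function below takes the smoothed 10-second heart rate values of one stage and returns both features. The function name and the dictionary keys are hypothetical, and taking the last six 10-second values as "the last minute" is an assumption of this sketch.

```python
def heart_rate_features(hr_10s_stage, window_s=10, last_s=60):
    """Average heart rate over the last minute of a stage and its reciprocal.

    hr_10s_stage: heart rate values of one stage, one value per
    10-second window (beats/min).
    """
    n = last_s // window_s          # number of windows in the last minute
    tail = hr_10s_stage[-n:]
    avg = sum(tail) / len(tail)
    return {"HR": avg, "HR_inv": 1.0 / avg}
```

For a four-minute stage this uses the final 6 of the 24 available values and ignores the earlier part of the stage, where heart rate is still rising.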

Accelerometer features Two classes of features are calculated from the accelerometer measurements: statistical features and phase plot features. The goal is to extract variables related to running economy and oxygen consumption (VO2) and to use these variables in combination with heart rate for predicting maximal oxygen uptake, since individuals with a lower VO2max are expected to have higher heart rates for similar VO2 levels [18].

Four statistical variables are calculated: mean, standard deviation, root mean square (RMS) and energy. RMS is often used in studies related to running gait analysis [19, 20]. The other variables are commonly used to describe movement patterns based on accelerometer measurements.
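The four statistical variables can be sketched as follows for a single acceleration signal. Two choices here are assumptions, because the text does not define them: the standard deviation is the population standard deviation, and energy is taken as the mean of the squared samples. The function name and dictionary keys are hypothetical.

```python
import math
from statistics import fmean, pstdev


def statistical_features(signal):
    """Mean, standard deviation, RMS and energy of one acceleration signal.

    Assumptions of this sketch: population standard deviation, and
    energy defined as the mean of the squared samples.
    """
    mean_square = sum(v * v for v in signal) / len(signal)
    return {
        "AVG": fmean(signal),
        "SD": pstdev(signal),
        "RMS": math.sqrt(mean_square),
        "E": mean_square,
    }
```

Under these definitions energy equals RMS squared, and RMS equals the standard deviation for a zero-mean signal, which illustrates why such features can be strongly correlated.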

Other features are extracted from the phase plot of the acceleration signal. A phase plot is constructed by plotting acceleration on the horizontal axis and jerk (derivative of acceleration, expressed in g · s⁻¹) on the vertical axis. An example is shown in Fig 3.

Three features are extracted from these plots: width, height and ratio. Width and height are defined as the distance between the 90th and 10th percentile of the points in the horizontal and vertical direction respectively. Ratio is defined as width divided by height.
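A minimal sketch of the phase plot features is shown below. Jerk is approximated with a forward difference and the percentiles are interpolated with the standard library's inclusive method; both choices, as well as the function name, are assumptions of this sketch rather than details given in the text.

```python
from statistics import quantiles


def phase_plot_features(acc, fs):
    """Width, height and ratio of the acceleration/jerk phase plot.

    acc: acceleration samples in g; fs: sampling rate in Hz.
    Jerk (g/s) is approximated by a forward difference, so the last
    acceleration sample has no jerk value and is dropped.
    """
    jerk = [(acc[i + 1] - acc[i]) * fs for i in range(len(acc) - 1)]
    acc = acc[:-1]

    def spread(values):
        # Distance between the 90th and 10th percentile.
        deciles = quantiles(values, n=10, method="inclusive")
        return deciles[-1] - deciles[0]

    width = spread(acc)    # horizontal axis: acceleration
    height = spread(jerk)  # vertical axis: jerk
    return {"PPW": width, "PPH": height, "PPR": width / height}
```

Using percentiles instead of the minimum and maximum makes width and height less sensitive to isolated spikes in the signal.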

Each variable is calculated for the anterior-posterior, mediolateral, vertical and total acceleration signals measured at the upper back, lower back and left (or right) tibia.

Because the treadmill accelerated and decelerated at the start and end of each stage, the first and last ten seconds of each stage are discarded.
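The two preprocessing steps mentioned here can be sketched as follows. Computing the total acceleration as the Euclidean norm of the three axes is an assumption of this sketch, since the text does not define it, and both function names are hypothetical.

```python
import math


def total_acceleration(ax, ay, az):
    """Total acceleration per sample, assumed to be the Euclidean norm."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(ax, ay, az)]


def trim_stage(signal, fs, margin_s=10.0):
    """Discard the first and last `margin_s` seconds of one stage."""
    k = int(round(margin_s * fs))
    if len(signal) <= 2 * k:
        raise ValueError("stage is too short to trim")
    return signal[k:len(signal) - k]
```

At the 1024 Hz sampling rate used in this study, trimming ten seconds on each side removes 10240 samples at the start and at the end of each stage.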

Fig 3. Example of a phase plot. Left: example of the total acceleration signal at the lower back. Right: phase plot of this signal.

Table 2. Overview of all features.

| Category | # | Features |
| Descriptive | 2 | Gender (1 = male, 2 = female): G |
| | | Body weight: BW |
| Heart rate | 8 | Average: HR^(-1)_i |
| Acceleration | 772 | Average: AVG^(-1)_a,d,i |
| | | Standard deviation: SD^(-1)_a,d,i |
| | | Root mean square: RMS^(-1)_a,d,i |
| | | Energy: E^(-1)_a,d,i |
| | | Phase plot width: PPW^(-1)_a,d,i |
| | | Phase plot height: PPH^(-1)_a,d,i |
| | | Phase plot ratio (width/height): PPR^(-1)_a,d,i |

Notation: stage number i = 0, 1, 2, 3 (stage 0 is the warm-up stage); accelerometer location a = t, bl, bu (t = left or right tibia, bl = lower back, bu = upper back); direction d = x, y, z, total (x = anterior-posterior, y = mediolateral, z = vertical); the superscript (-1) means that both the feature and its reciprocal are included.

Prediction models

We can use different combinations of the three types of features presented in the previous section. To find out which combination works best for predicting maximal oxygen uptake, we compare four models using different combinations of features. The first model (S1) uses only the descriptive features. In the second model (S2) we start from the descriptive features of S1 and add heart rate features. In the third model (S3) we start again from the descriptive features of S1 and add accelerometer features. Heart rate and accelerometer features are then combined in the fourth model (S4) by extending S2 with accelerometer features.

We use least squares linear regression for training the four models. This method learns the weights [w0 w1 ... wn] of a linear function

y = w0 + w1 · x1 + ... + wn · xn

such that the error ||ŷ − y||²₂ is minimized, where y and ŷ are vectors of length m containing respectively the measured and predicted VO2max values for m data points, and [x1 ... xn] is an m × n matrix of feature values.
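The following sketch shows one way to obtain these weights with the Python standard library only, by solving the normal equations with Gaussian elimination. All function names are hypothetical and the solver is chosen for brevity; in practice a numerically more robust routine (for example a QR-based least squares solver from a numerical library) would normally be preferred.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_least_squares(rows, targets):
    """Return [w0, w1, ..., wn] minimizing the squared prediction error."""
    design = [[1.0] + list(row) for row in rows]  # column of ones for w0
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(weights, row):
    """Evaluate y = w0 + w1 * x1 + ... + wn * xn for one data point."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], row))
```

The solver raises an error when the feature columns are linearly dependent, which is the extreme case of the multicollinearity problem.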

Feature selection

A common problem with linear regression is that this method is sensitive to multicollinearity in the input features. This problem must be addressed here since the accelerometer features are highly intercorrelated: there are for example correlations between different variables (e.g. RMS and energy) and between features across different stages, accelerometers and accelerometer axes.

To solve this problem, one could use regularization techniques such as ridge [21] and lasso [22] instead of least squares regression. However, both methods only minimize the weights – the L₂-norm (ridge) and L₁-norm (lasso) of the weight vector w – and do not exploit the relationship between heart rate and accelerometer features. Therefore we need a method for selecting uncorrelated features that are relevant for the prediction of maximal oxygen uptake.
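For reference, the two regularized objectives can be written as follows. This is the standard textbook formulation rather than anything specific to this study: X denotes the m × n feature matrix [x1 ... xn], the regularization strength λ ≥ 0 is a symbol introduced here that does not appear in the text, and the intercept w0 is omitted for brevity.

```latex
\begin{align}
\text{ridge:} \quad & \min_{w} \; \lVert y - Xw \rVert_2^2 + \lambda \lVert w \rVert_2^2 \\
\text{lasso:} \quad & \min_{w} \; \lVert y - Xw \rVert_2^2 + \lambda \lVert w \rVert_1
\end{align}
```

In both cases the penalty depends only on the size of the weights and not on which category a feature belongs to.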

We use a greedy forward selection method for adding features in S2, S3 and S4. In each of these models we start from an initial feature set F and add features one by one.

The initial features in S2 and S3 are the descriptive features of S1, while in S4 these are the descriptive and heart rate features of S2. Each new feature f is evaluated by the adjusted explained variance (R²_adj) of a least squares linear regression model with F ∪ {f} as input features. The evaluation is done on the training set using leave-one-subject-out cross-validation (the next section explains this validation technique in more detail). The best feature f* is added to the model if R²_adj improves by at least 0.05 compared to the model without the new feature. This is repeated until the best remaining feature improves R²_adj by less than 0.05, at which point no more features are added. In the resulting feature set F ∪ {f₁*, ..., f_n*}, each selected feature f_i* is one of the features from Table 2. Note that this set contains features of different categories and possibly from different stages.
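The selection procedure can be sketched as below, using only the Python standard library. Several details are assumptions of this sketch and not taken from the text: all function names, the adjusted R² formula 1 − (1 − R²)(N − 1)/(N − p − 1) computed from the pooled held-out predictions, the normal-equation solver, and the choice to skip candidates that make the regression singular.

```python
def solve(a, b):
    """Gaussian elimination with partial pivoting for a * x = b."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit(rows, y):
    """Least squares weights [w0, ..., wn] via the normal equations."""
    d = [[1.0] + list(r) for r in rows]
    p = len(d[0])
    xtx = [[sum(v[i] * v[j] for v in d) for j in range(p)] for i in range(p)]
    xty = [sum(v[i] * t for v, t in zip(d, y)) for i in range(p)]
    return solve(xtx, xty)


def loso_adjusted_r2(rows, y, subjects):
    """Adjusted R² of leave-one-subject-out predictions."""
    preds = [0.0] * len(y)
    for held_out in set(subjects):
        train = [i for i, s in enumerate(subjects) if s != held_out]
        w = fit([rows[i] for i in train], [y[i] for i in train])
        for i, s in enumerate(subjects):
            if s == held_out:
                preds[i] = w[0] + sum(a * b for a, b in zip(w[1:], rows[i]))
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, preds))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    n, p = len(y), len(rows[0])
    return 1.0 - (ss_res / ss_tot) * (n - 1) / (n - p - 1)


def forward_select(features, y, subjects, initial, min_gain=0.05):
    """Greedily add the best feature while adjusted R² gains >= min_gain.

    features: dict mapping feature name to a list of values per test.
    """
    def score(names):
        rows = [[features[k][i] for k in names] for i in range(len(y))]
        try:
            return loso_adjusted_r2(rows, y, subjects)
        except ValueError:  # collinear candidate: treat as unusable
            return float("-inf")

    selected = list(initial)
    best = score(selected)
    while True:
        candidates = [k for k in features if k not in selected]
        if not candidates:
            break
        top_score, top_name = max((score(selected + [k]), k) for k in candidates)
        if top_score - best < min_gain:
            break
        selected.append(top_name)
        best = top_score
    return selected
```

Holding out all tests of a subject together prevents the two tests of the same runner from appearing on both sides of a split.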

Evaluation method

We use leave-one-subject-out cross-validation (LOSOCV) for the evaluation since the data set is too small to be split into a fixed training and test set. With LOSOCV, the test data in each fold consist of one or two data points, depending on whether the test subject completed one or two treadmill tests. All other data points are used for learning a model. The predicted VO2max values are evaluated by the explained variance (R²) of the model. We also calculate mean absolute error (MAE) and root mean squared error (RMSE), expressed in ml · kg⁻¹ · min⁻¹, to compare the errors of the models. These evaluation metrics are defined as follows:

\[ R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} \]

\[ \mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \lvert y_i - \hat{y}_i \rvert \]

\[ \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2} \]

where y are the measured VO2max values (with average ȳ) and ŷ are the predicted values for the N = 41 treadmill tests.
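A minimal sketch of the three metrics, assuming their standard definitions (R² as one minus the ratio of residual to total sum of squares), is given below. The function name and dictionary keys are hypothetical.

```python
import math


def evaluation_metrics(measured, predicted):
    """Explained variance (R²), MAE and RMSE of a set of predictions."""
    n = len(measured)
    mean_y = sum(measured) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": sum(abs(y - p) for y, p in zip(measured, predicted)) / n,
        "RMSE": math.sqrt(ss_res / n),
    }
```

With cross-validated predictions R² can become negative, which happens when the model predicts worse than the mean of the measured values.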

Results

Comparison of the four models

Fig 4 shows how the predicted VO2max values fit the measured values for the four models (S1, S2, S3 and S4). The results show that S4, the combination of descriptive features, heart rate and accelerometer features, predicts VO2max better than the other models. This model has an explained variance (R²) of 0.784. The predictions of S4 are more accurate than those of S2 (using descriptive features and heart rate features) and S3 (using descriptive features and accelerometer features). The R² values of these models are 0.692 and 0.530 respectively. Both S2 and S3 are in turn more accurate than S1, which has an R² value of 0.494. This comparison shows that using accelerometer features improves the predictions compared to using only descriptive features and heart rate features, but the accelerometer features need to be combined with heart rate.

Apart from explained variance, Table 3 also reports mean absolute error (MAE) and root mean squared error (RMSE) for each model.

The linear functions learned for S1, S2, S3 and S4 are shown in Table 4. These functions are learned from the complete data set. The best model (S4) uses four features: gender, body weight, reciprocal of mean heart rate during the warm-up stage and reciprocal of the standard deviation of total tibia acceleration during the warm-up stage.

Table 3. Comparison of the four models. R² = explained variance, MAE = mean absolute error and RMSE = root mean squared error (MAE and RMSE are expressed in ml · kg⁻¹ · min⁻¹).

Comparison with the model of Weyand et al.

Weyand et al. [12] proposed a feature called ‘aerobic fitness index’ (AFI), defined as (tc · HR)⁻¹, for combining foot-ground contact time and heart rate. They measured foot-ground contact time via an accelerometer placed on the foot. In this study no
