Modeling Cumulative Impact Part III

Model the performance of five elite swimmers based on past training intensity

A swimmer performing freestyle, from Wikipedia

Until now, this series has studied simulated data from a physiological model, the “fitness-fatigue” model of athletic performance. While the model is specific, the insights gained about cumulative impact are generally applicable. Recall that Part I illustrated the use of convolutions with exponential decay, while Part II generalized the convolving functions using splines.

While simulated data is an excellent tool for understanding methods, it has provided us with a few unnatural advantages (e.g., in choosing parameter starting values and spline knot placements). This article applies the methods from Parts I and II to a real data set and involves transformation, trial and error, and a bit of intuition. We’ll also calculate standard errors since it’s no longer possible to “see that the values have been recovered.”

The data, kindly provided by Christian Rasche (working group of Prof. Mark Pfeiffer, Institute of Sports Science, Johannes Gutenberg-University Mainz) and studied by Kolossa et al. [1], comes from an experiment lasting 160 days, where five swimmers trained both in water and on dry land and performed a weekly swimming test for speed. In water, training was quantified as the swimming intensity (1, 2, 3, 5, or 8) multiplied by the distance swum at each intensity. On land, one hour of training was converted to 2 km of swimming distance and weighted by the type of training (2 for endurance, 5 for conditioning, 8 for strength). Both water and land training were accumulated to the daily level, and this dose ranges from exactly zero (i.e., a rest day) to 48.6 units in the data set.
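To make the dose definition concrete, here is a minimal sketch of how a single day's dose could be assembled from the rules just described. The function name daily_dose and its arguments are hypothetical, and it assumes water distances are recorded in km and dryland time in hours; the original data files may be organized differently.

```r
# Hypothetical helper (not from the original data pipeline).
# Assumes water distances are in km and dryland time is in hours.
daily_dose <- function(water_km, water_intensity, land_hours, land_type) {
  land_weights <- c(endurance = 2, conditioning = 5, strength = 8)
  # Water: intensity (1, 2, 3, 5, or 8) times the distance swum at that intensity
  water <- sum(water_intensity * water_km)
  # Land: one hour counts as 2 km, weighted by the type of training
  land <- sum(2 * land_hours * land_weights[land_type])
  water + land
}

# 2 km at intensity 3, 1 km at intensity 5, plus one hour of strength training
daily_dose(water_km = c(2, 1), water_intensity = c(3, 5),
           land_hours = 1, land_type = "strength")
```

A rest day corresponds to passing empty vectors for every argument, which yields a dose of zero.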

Raw performance is the mean velocity (m/s) of three repetitions of a swimming test, where resistance is added at each repetition and the swimmer must cover 20 meters without a wall start. Unlike the hammer thrower from [2] (studied in the first two parts of this series), there is not a test every day, and hence there are “missing values” in the response column. Slightly processed from their original form, below are the first rows of data for Swimmer 1.

day training_dose mean_velocity
1   16.02         1.241971153
2   19.31268      NA
3   9.03          1.250052057
4   12.91618      NA
5   0             NA
6   9.6657        NA
7   5.87          1.329180349

What’s in a performance metric? Both throwing distance and swimming speed are bigger-is-better type variables and would work in their raw forms. Parts I and II, however, used a transformed performance metric known as the “criterion point.” For the criterion point metric, 0 aligns to the performance of an able-bodied individual, 1000 aligns to the world record performance, and improvements near the human limit count for more than improvements at a more ordinary level.

Christian Rasche (who provided the data) mentioned that the use of the criterion point transformation was “debatable” in the current context. Noting his caution, we will use it here. The first reason is to achieve continuity from Parts I and II, which used the criterion point for the sport of hammer throwing. The second reason is that the transformation is a topic of its own and a worthy addition to any data science toolbox.

The 1990 article Modeling human performance in running [3] bases its approach to quantifying performance on the observation that, over time, world track records approximately follow the form y = L + a exp(-x / b), approaching an asymptote L at a rate determined by the time constant b. The parameter a can be made positive or negative to switch between bigger-is-better (e.g., throwing) and smaller-is-better (e.g., timed running events) scoring.

Figure 1. World record progression for the men’s 200 m. By Ar558 — Own work, CC BY-SA 3.0

To get a transformation on the raw performance y, the authors suggest solving for the time in the equation above, i.e., x = b ln(a / (y - L)). It’s strange when interpreted literally: an athlete’s performance is the time in human history when it would have constituted the world record. The authors call this transformed quantity the “criterion point,” or “Cp.”

There is a procedure to get a, b, and L. First, set L to be the hypothesized limit of human performance, leaving two unknowns a and b. Then create two equations using x = b ln(a / (y - L)). The first sets the left-hand side to 1000 and substitutes the world record performance for y. The second sets the left-hand side to 0 and substitutes the able-bodied performance for y. With two equations and two unknowns, this system can be solved for a and b. It is a nonlinear system, however, and Code Block 1 below contains a solver R function that takes as inputs the world record, able-bodied, and human-limit performances and returns a list that includes a and b.

Code Block 1. An R function to solve for a and b given assumptions about the world record, able-bodied and human-limit performances. Supports types “running” (smaller is better) and “jumping” (larger is better).

There’s no way to know even the world record for this very specific swimming test, but the assumptions of a 2.0 meters per second human limit (remember that resistance is added), a 1.8 meters per second world record, and a 0.5 meters per second able-bodied performance lead to a scale such that our five swimmers collectively perform near the middle of the 0 to 1000 range (similar to the hammer thrower studied previously). Using these inputs with the R function provided in Code Block 1, we arrive at a = -1.5 and b = 496.3.

get_perf_params(world_record_perf = 1.8,
                able_bodied_perf = 0.5, limit = 2.0,
                type = "jumping")
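As a cross-check on these values, the two equations also appear to admit a direct solution: setting the left-hand side to 0 forces a / (y - L) = 1 at the able-bodied performance, so a = y_able - L, and b then follows from the world record equation. The sketch below is not the author's get_perf_params; the names cp_params_closed_form and to_cp are hypothetical and only illustrate the algebra.

```r
# Hypothetical cross-check, not the author's get_perf_params().
cp_params_closed_form <- function(world_record_perf, able_bodied_perf, limit) {
  # From 0 = b * ln(a / (y_able - L)): the log must be zero, so a = y_able - L
  a <- able_bodied_perf - limit
  # From 1000 = b * ln(a / (y_record - L))
  b <- 1000 / log(a / (world_record_perf - limit))
  list(a = a, b = b)
}

# Transform raw performance y to the criterion point scale: x = b * ln(a / (y - L)).
# Missing performances (NA) simply stay NA.
to_cp <- function(y, a, b, limit) {
  b * log(a / (y - limit))
}

p <- cp_params_closed_form(world_record_perf = 1.8,
                           able_bodied_perf = 0.5, limit = 2.0)
round(p$a, 1)  # -1.5
round(p$b, 1)  # 496.3

# The anchors map back to (approximately) 0 and 1000 by construction
to_cp(c(0.5, 1.8), p$a, p$b, limit = 2.0)
```

Under these assumptions the result agrees with the a = -1.5 and b = 496.3 reported above.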

Code Block 2 below also uses get_perf_params after reading in data from the raw data files of the five swimmers. It also transforms their measurements to the criterion point (Cp) scale and plots the data.

Code Block 2. Reading in data, followed by transformation and plotting.

Figure 2. Transformed performance measurements and daily relative training intensities for the five swimmers.

When comparing the above graphs to our hammer thrower’s in Parts I and II, two points stand out:

The performance trends are more subtle than our hammer thrower’s, who experienced fairly dramatic swings in performance. Swimmer 1’s improvement throughout the study is the easiest trend to discern.
The performance measurements are less frequently collected than for the hammer thrower, and some swimmers are measured more frequently than others.

The fitness-fatigue model provides a helpful starting point due to the structure it imposes, one which we’ll benefit from when considering spline knot placement later in the article. Code Block 3 fits the model for each swimmer separately and produces parameter estimates (Table 1), standard errors (Table 2), and a plot of the fitted values overlaying the data (Figure 3).

Code Block 3. Fitting the fitness-fatigue model for each swimmer, and plotting the fitted values.

Since there are missing values in the performance variable, the rss function (line 30) must now use the option na.rm = TRUE when computing the sums of squares. All daily training intensity data is available and the computation of the convolution variables proceeds exactly as before.
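The idea can be sketched as follows. This is not the code from Code Block 3: the function names, the parameter ordering theta = c(mu, k1, tau1, k2, tau2), and the convention that training on days 1 to d - 1 affects performance on day d are assumptions carried over from the model as described in Part I.

```r
# Hypothetical sketch of the fitness-fatigue model and its residual sum of squares.
# Exponentially weighted sum of past doses: sum over lags of dose[d - lag] * exp(-lag / tau)
exp_convolve <- function(dose, tau) {
  n <- length(dose)
  out <- numeric(n)               # day 1 has no training history, so it stays 0
  for (d in 2:n) {
    lags <- seq_len(d - 1)
    out[d] <- sum(dose[d - lags] * exp(-lags / tau))
  }
  out
}

# theta = c(mu, k1, tau1, k2, tau2): intercept, fitness weight and time constant,
# fatigue weight and time constant
ff_predict <- function(theta, dose) {
  theta[1] + theta[2] * exp_convolve(dose, theta[3]) -
    theta[4] * exp_convolve(dose, theta[5])
}

rss_sketch <- function(theta, dose, perf) {
  # na.rm = TRUE skips the days without a swimming test
  sum((perf - ff_predict(theta, dose))^2, na.rm = TRUE)
}
```

Because the dose series has no gaps, predictions exist for every day; only the comparison against observed performance is restricted to test days. A function like rss_sketch could then be handed to a general-purpose optimizer such as optim.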

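Lines 36 to 60 are ad hoc functions for computing nonlinear least squares standard errors. They work by approximating the n x p matrix of partial derivatives of the model’s fitted values with respect to the parameters (the Jacobian, evaluated at the parameter estimates), which enters the variance formulas just like the X matrix in ordinary linear regression.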
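A minimal sketch of this calculation, assuming a forward finite-difference approximation and the usual variance estimate sigma^2 (F'F)^-1 with sigma^2 = RSS / (n - p), is given below. The function names and the step-size rule are hypothetical and are not taken from Code Block 3.

```r
# Hypothetical sketch: approximate NLS standard errors via a finite-difference Jacobian.
# f(theta, ...) must return the vector of fitted values.
numeric_jacobian <- function(f, theta, ..., eps = 1e-6) {
  f0 <- f(theta, ...)
  jac <- matrix(0, nrow = length(f0), ncol = length(theta))
  for (j in seq_along(theta)) {
    step <- eps * max(abs(theta[j]), 1)     # scale the step to the parameter
    theta_j <- theta
    theta_j[j] <- theta_j[j] + step
    jac[, j] <- (f(theta_j, ...) - f0) / step
  }
  jac
}

nls_std_errors <- function(f, theta_hat, y, ...) {
  keep <- !is.na(y)                         # only days with an observed response
  jac <- numeric_jacobian(f, theta_hat, ...)[keep, , drop = FALSE]
  resid <- (y - f(theta_hat, ...))[keep]
  sigma2 <- sum(resid^2) / (sum(keep) - length(theta_hat))
  # The Jacobian plays the role of X in sigma^2 * (X'X)^-1
  sqrt(diag(sigma2 * solve(crossprod(jac))))
}
```

For a model that is linear in its parameters, the Jacobian is exactly the X matrix, so this should reproduce the standard errors reported by lm; that makes a convenient sanity check before applying it to a nonlinear fit.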

As before, we’ll need starting values. On line 72 of Code Block 3, starting values from the previous articles are commented out in favor of those on line 73. When initialized at the former values, Swimmers 1, 2, and 4 converged to a solution with much smaller fitness and fatigue time constants than the hammer thrower, suggesting faster dynamics for the swimmers in general. Swimmers 3 and 5, on the other hand, converged to strange places (e.g., time constants of nearly 2000). Switching to smaller starting time constants didn’t change the solutions for Swimmers 1, 2, and 4 and resulted in more intuitive solutions for Swimmers 3 and 5, so the smaller starting values are used for the remainder of this article. These solutions are shown below in terms of the parameter estimates (Table 1) and approximate standard errors (Table 2).

Table 1. Fitness-fatigue model solutions for Swimmers 1–5.

Table 2. Fitness-fatigue model approximate standard errors for Swimmers 1–5.

In an email discussing his data set, Christian Rasche mentioned that the swimmers in this study were considered “elite.” He explained that, since elite athletes already achieve high levels of performance and are well-accustomed to training, they may not exhibit the typical variations in performance. When there are clear performance improvements, his rules of thumb are that “the fatigue delay (time constant) needs to be bigger than the fitness delay (roughly about two to three times as big) and the fatigue-to-fitness ratio needs to be > 1.”

For Swimmer 1, having made at least some visible improvement throughout the study, the estimated fitness time constant is the second largest among swimmers at 34.5 days (still considerably shorter than the hammer thrower’s at 60 days). The estimated fatigue time constant is very small at 1.35 days (only a tenth of the hammer thrower’s), suggesting very quick recovery from intense training. Despite a very fast decay, the fatigue weight is 4.6 times larger than the fitness weight (compared to 2.6 times for the hammer thrower). Standard error calculations in Table 2 suggest some quality to these estimates, though standard errors for the fatigue parameters are relatively high compared to the others.

Swimmer 3 has the largest estimated fatigue time constant of all the swimmers at 13.4 days, just barely surpassing the hammer thrower’s. The fatigue weight is greater than the fitness weight by only 1.5 times, diminishing the consequence of the longer fatigue time constant. The estimated fitness time constant is 23.5 days. The standard error estimates for Swimmer 3 are quite large for all parameters except the intercept.

Swimmer 4 has a minuscule fitness effect according to the model fit, and the slightly positive fitness weight is dominated by the fatigue weight (by a factor of almost 30!). Fatigue again dies off very fast, and the estimated fitness time constant is the longest of the five swimmers at 46 days. With the exception of the fitness time constant, Swimmer 4’s standard errors are relatively small, suggesting some degree of precision for the various parameter estimates.

Swimmers 2 and 5 have very short dynamics for both fitness and fatigue. The estimated fitness and fatigue effects are almost equal for Swimmer 5, who also has the fewest test data points. The standard error estimates for these two swimmers are ridiculously large. Note especially Swimmer 5’s standard errors for the fitness and fatigue weights, which illustrate how little distinct information there is about these two effects.

Acknowledging the limitations of the inference here, the fitness and fatigue dynamics seem to be happening on a faster scale for these elite swimmers than for our collegiate hammer thrower. In 4 out of 5 cases, the fatigue effect dissipates very quickly. The relative strength of the fatigue effects to fitness effects varies considerably between the model fits, but in all cases (even Swimmer 5’s, barely), the fatigue weight is larger than the fitness weight.

Figure 3 below shows the plotted predictions from the model fits described above. The model is able to pick up the positive trend for Swimmer 1, but does not capture much other interesting variation. Swimmer 2’s model does predict a slight performance increase after the heavy training spike around day 60, which looks like it may have occurred (or it could be a coincidence). There’s not a lot to see for Swimmers 3–5.

Figure 3. Transformed performance measurements and daily relative training intensities for the five swimmers overlaid with fitness-fatigue model fitted values.

Splines can provide additional flexibility by directly estimating the lag distribution that is convolved with the training data. We saw this in Part II, but now it is time to apply it to real data; Code Block 4 implements the method separately for each swimmer.

Code Block 4. Implementing the spline-based convolution approach for the five swimmers and plotting both the estimated impulse responses and the fitted values.

Since the regression procedure ejects rows with missing values by default, the procedure runs without modifications. Note the selection of knot placements on line 6. Interior knots at 2 and 6 days capture the short term dynamics suggested by the fitness-fatigue fits, while the knot at 25 days is there to accommodate the longer term dynamics observed for Swimmer 1. The fitted impulse responses are shown in Figure 4.

These are the impulses affecting performance over time due to one training session at day 0. In a classical setting, we’d expect it to start negative as the fatigue effect initially overpowers the fitness effect, to reach a positive maximum as fatigue dissipates, and to slowly decay to zero as the fitness effect decays. See Part II for more details.
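For readers without Part II at hand, the mechanics can be sketched as below. This is not Code Block 4: the function name is hypothetical, and the choice of a natural cubic spline basis with an intercept column and boundary knots at the smallest and largest lags is an assumption. The idea is that if the impulse response is a linear combination of basis functions over the lag axis, then convolving each basis function with the training doses yields ordinary regression features.

```r
library(splines)

# Hypothetical sketch of spline-based convolution features.
# Interior knots at lags of 2, 6, and 25 days, as in the text.
spline_conv_features <- function(dose, knots = c(2, 6, 25)) {
  n <- length(dose)
  lags <- seq_len(n - 1)
  # One column per basis function, evaluated at every lag
  basis <- ns(lags, knots = knots, Boundary.knots = c(1, n - 1),
              intercept = TRUE)
  z <- matrix(0, nrow = n, ncol = ncol(basis))
  for (d in 2:n) {
    s <- seq_len(d - 1)
    # Convolve each basis function with the doses on days d - 1, d - 2, ..., 1
    z[d, ] <- colSums(basis[s, , drop = FALSE] * dose[d - s])
  }
  list(features = z, basis = basis, lags = lags)
}
```

Regressing performance on these columns (for example with lm, which drops rows with a missing response by default) gives one coefficient per basis function, and multiplying the basis matrix by those coefficients traces out the estimated impulse response over the lags.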

Figure 4. The fitted impulse responses for the five swimmers. Knots are represented by the crosshairs on the green horizontal line.

In Figure 4, immediately apparent are the relatively weak fitness effects, which is not surprising given the swimmers’ elite status. Swimmers 3 and 4 have no obvious fitness effect at all, and Swimmer 3 doesn’t even show much of a fatigue effect. Swimmer 1 has the impulse response profile that most resembles the classical fitness-fatigue profile with a relatively long fitness decay profile (the knot placed out at 25 days was for Swimmer 1).

Swimmer 2’s estimated impulse response features some artifacts that can be reduced with custom knot placements (i.e., removing the knot at 25 days). It is notable that Swimmer 2’s fitness effect is the most pronounced of the five swimmers, although it is short lived. As a speculation, perhaps this is a different “fitness” phenomenon than Swimmer 1’s.

Given the awkward parameter estimates from Swimmer 5’s fitness-fatigue fit, it is surprising that the impulse response looks as similar as it does to the others (especially Swimmer 2’s).

Below, we can see the performance predictions made by the spline-based convolution approach (in orange) together with the parametric fitness-fatigue model’s predictions (in blue). The predictions from both models are quite similar.

Figure 5. Predictions made by the spline-based convolution approach (in orange) together with the parametric fitness-fatigue model’s predictions (in blue).

There was trial and error in the knot placement above. An alternate approach was to put a knot somewhere between the first few weeks and the final boundary knot around 160 days. Figure 6 shows that this leads to poor results.

Figure 6. Effects of a bad knot placement for the fitted impulse responses. Knots are represented by the crosshairs on the green horizontal line.

An explanation is that a time lag of 80 days is barely observed twice in the study, whereas smaller lags from training to performance are observed much more frequently. Putting a knot out at such a low information point doesn’t do our analysis any favors, and raises a warning about this spline-based convolution method: in a truly exploratory situation (without a parametric model to guide us), an unfortunate knot placement might lead us to an unrealistic impulse response. It is wise to pay attention to the amount of information available over the time lag axis.
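One rough way to look at the information available along the lag axis is to count, for each lag, how many observed tests fall late enough in the study to have a training day at that lag behind them. The helper below is a hypothetical diagnostic, not part of the original analysis, and it is illustrated with only the first seven days of Swimmer 1 shown earlier.

```r
# Hypothetical diagnostic: how many observed tests can "see" a dose at each lag?
# A test on day d has a training day at lag s only if d - s >= 1, i.e. d > s.
lag_support <- function(perf, lags) {
  test_days <- which(!is.na(perf))
  sapply(lags, function(s) sum(test_days > s))
}

# First seven days of Swimmer 1: tests on days 1, 3, and 7
perf <- c(1.241971153, NA, 1.250052057, NA, NA, NA, 1.329180349)
lag_support(perf, lags = c(1, 2, 6, 7))
```

Run on a full 160-day series, a count like this shrinks steadily as the lag grows, which is one way to see why knots placed at long lags are poorly supported by the data.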

Real data always leads to a different experience than its simulated counterpart, and we’re lucky that the experience was as similar as it was. We did detect both fitness and fatigue effects for the swimmers, though the strength and time scale of the effects varied considerably. Standard errors suggested a high level of uncertainty in many of the parameter estimates, and our trial-and-error approach to starting values and knot selection adds further procedural variation that is not easily quantified.

We transformed the raw performance variable, meters per second of velocity, into the Criterion Point, or Cp, scale. This introduced an interesting nonlinear transformation and created a sense of continuity from Parts I and II even as the sport changed. We made strong assumptions about the values a , b , and L , but those seem unlikely to qualitatively change the analysis. Only minor adjustments had to be made to handle “missing” performance data at the daily level.

In Part IV, Modeling Cumulative Impact will finally move past convolution-based features and towards the use of dynamic forms. This will bring us closer to standard time series approaches. In the first stage of this journey, cumulative impact will be accommodated by use of a latent state vector, making the Kalman Filter an excellent tool for handling the fitness-fatigue model in state space form. It will also motivate us to leave R for Python as we make use of the MLEModel class of the statsmodels toolkit.

[1] D. Kolossa, M. Bin Azhar, C. Rasche, S. Endler, F. Hanakam, A. Ferrauti, and M. Pfeiffer, Performance Estimation using the Fitness-Fatigue Model with Kalman Filter Feedback (2017), International Journal of Computer Science in Sport

[2] T. Busso, R. Candau, and J. Lacour, Fatigue and fitness modelled from the effects of training on performance (1994), European Journal of Applied Physiology

[3] R. Morton, R. Fitz-Clarke, and E. Banister, Modeling human performance in running (1990), Journal of Applied Physiology
