
2:00:25: Eliud Kipchoge’s lesson in endurance physiology (or biomechanics, or psychology, or nutrition, or…)

May 6, 2017

Earlier this morning, we witnessed a unique and surprisingly exciting time trial over 26.2 miles in Nike’s Breaking2 project, now allegedly being rebranded as #VeryNearlyBreaking2. I thought that, since almost everyone else will, I’d give my thoughts on the event. First, two confessions/declarations: my old PhD supervisor and close collaborator Prof Andy Jones was a consultant on this project, as was Dr Phil Skiba, who I have got to know very well since he started working with Andy’s Exeter group. Second, I only watched the last 25 km of the event, because it was a Saturday and I wasn’t going to get up at 4:30 am to watch the first half of a marathon in which almost nothing happens.

I had no inside information about the event, mostly because I didn’t pry, other than the odd “how’s that thing going?”, to which the reply would usually be “fine” or “interesting”. Most of what I could glean either was or now is in the public domain, largely thanks to the Twitter feeds of Phil and Andy and the work of Alex Hutchinson. But the event itself was, for me, a lesson in the physiology of endurance running and the acute effects of prolonged high-intensity effort. As the post’s title implies, there are plenty of other lessons to learn from this, but this one is mine.

Kipchoge completed the 26.2 miles in a blistering 2:00:25, by ~2.5 minutes the fastest a human has ever run the distance.  At 25 km, he was bang on schedule, but in the last 10 km he lost time, and lost about 15-20 s in the last 5 km alone. Why could he not hold that pace in spite of all of the assistance from the arrowhead of pacers (or the car)? To understand that, I think we need to look at what it takes, physiologically, to run a marathon. We hear a lot about maximal oxygen uptake, lactate threshold (LT) and running economy, as these largely place limits on what is possible. The maximal oxygen uptake is the size of the engine, the LT is the rev limiter (sort of) and the running economy is the miles per gallon (or, in this case, per gram of glycogen) that the engine can offer. What is less well appreciated is that there is not one “rev limiter” in physiology, but two. Running faster than your LT has implications for endurance because you develop a “slow component” of oxygen uptake which increases the oxygen cost of exercise (it reduces economy, in other words). But the slow component can be stabilised provided you run at a speed slower than what is called your “critical speed”. This is the second rev limiter, and I’m going to argue that it is crucial to Kipchoge’s efforts today.

The critical speed, simply defined (!), is the speed asymptote of the speed-duration relationship (analogous to the critical power in the power-duration relationship). Andy Jones and I have written about these concepts in the scientific literature here and here if you want more detail on the definition.  The key point for this post is that a 2 hour marathon effort requires a running speed that is necessarily below the critical speed, because above this point, the slow component of oxygen uptake cannot be stabilised, leading rapidly to the attainment of maximal oxygen uptake and inevitably to task failure.  But the sustainable pace for a marathon run this fast must have been very close to the critical speed (see Jones & Vanhatalo (2017) for more on this). Thus, Kipchoge’s task was to run as close to his critical speed as possible, and stay there. For the most part, he succeeded.
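For readers who want to see how critical speed drops out of the data, here is a minimal sketch of the two-parameter model, d = CS·t + D′, where D′ is the finite distance capacity available above CS. The performance times below are illustrative numbers I have made up, not Kipchoge’s actual test data:

```python
# Estimating critical speed (CS) from two maximal time trials using the
# linear distance-time form of the hyperbolic model: d = CS * t + D'
# CS is the slope; D' (the intercept) is the distance capacity above CS.

def critical_speed(d1, t1, d2, t2):
    """Return (CS in m/s, D' in m) from two distance/time pairs."""
    cs = (d2 - d1) / (t2 - t1)   # slope of the distance-time line
    d_prime = d1 - cs * t1       # intercept
    return cs, d_prime

# Hypothetical elite performances: 3000 m in 7:20 (440 s), 5000 m in 12:40 (760 s)
cs, d_prime = critical_speed(3000, 440, 5000, 760)
print(f"CS = {cs:.2f} m/s, D' = {d_prime:.0f} m")
# A 2:00:25 marathon averages 42195 / 7225 = ~5.84 m/s, which on this model
# must sit fractionally below the athlete's CS for the effort to be sustainable.
```

The point of the sketch is that CS is a property of the whole speed-duration relationship, not any single race pace.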

Kipchoge looked in full control and on pace for much of the Breaking2 effort. But if Kipchoge ran just below his critical speed, why did he slow down in the last 5 km? Well, a normal runner running above LT but below the critical speed is operating in their “heavy domain”, where we strongly suspect glycogen depletion plays a key role in determining exercise capacity, one way or another. The presence of a slow component of oxygen uptake increases the demand on fuel reserves, which probably explains why most studies show marathon efforts being completed just above (but not too far above) LT. We mere mortals cannot get close to critical speed during a marathon lasting 2:30-4:00 hours, but there is good reason to suppose elite athletes can. One reason for that is an apparent “domain compression” in which the LT and critical speed both occur at a very high fraction of maximal oxygen uptake. Another is that elite runners have phenomenal fatigue resistance, in part due to the high percentage of type I (slow twitch) fibres in their muscles.

The above considerations provide a reason for Kipchoge’s basic speed, but not for his slowing down. Obviously, the distance itself places a severe strain on fuel stores, but he was still running exceptionally quickly towards the end, albeit grimacing. He didn’t look like he blew up catastrophically. He lost some of the advantage of drafting as the “arrow” collapsed and the car gapped all of the runners, but again this would probably have a minor influence since this only happened in the last few laps. The effects of heat and dehydration were also probably minimal, as it wasn’t hot and he was regularly drinking. That leaves few possibilities for his slowing down. Undoubtedly there would have been some muscle damage at this stage, and we know that progressive, slowly-developing fatigue occurs in the heavy domain. However, the slowing in running pace was not progressive, which leads me to conclude that his critical speed itself may have decreased.

The above idea seems at odds with what we have seen experimentally: fatiguing exercise and glycogen depletion do not seem to alter the critical power in cycling, but these experiments are nothing like prolonged exercise performance in elite athletes.  I have heard anecdotal reports of a diminished critical power at the end of cycle stage races or long-duration time trials. If true, it would explain the fall-off in pace without catastrophic failure.  What makes the 2 hour marathon such a challenge is that it is close to the limit of what we think humans can sustain, and the effort itself likely reduces that sustainable pace in the last 10-15 km.

Kipchoge is the first to treat the marathon like a 2 hour time trial and hold it together for most of the distance. If he can find a way of holding it together for the full duration, an athlete of Kipchoge’s talent really could break 2.

I want my country back! Personal thoughts on the EU Referendum

June 3, 2016

I want my country back. It’s a refrain we’ve heard a lot in the last 5-10 years. It’s generally a call to restore some sense of what being British really means, of what Britain should feel like, and the people Britain should be composed of.  What has made this refrain so commonplace is the sense that Britain has lost its sense of identity to an influx of immigrants from the EU and elsewhere.  If only we could get shot of that damned institution we’d be able to get our country back.

My father-in-law is an ex-pat who lives in France and drives around in a French car. Every once in a while he drives over to see us.  On one such occasion he drove it to a local supermarket in Strood, Kent.  At a set of traffic lights he stopped, and a man ventured towards his car, as if he was going to ask directions.  As my father-in-law opened the window, he was greeted with “YOU FRENCH BASTARD!”, at which point the lights went green. As a result he was unable to explain that he’d lived most of his adult life in Oxfordshire.  It’s an amusing anecdote, but underneath it is something far nastier: Britain has become increasingly hostile to foreign people in the last few years, and sometimes this boils over into xenophobia and racism.  A colleague of mine from Italy who has lived in the UK for a decade is very clear that this xenophobia is relatively new.  It is only in the last few years that strangers have urged him to “fuck off back to your own country” whilst he is walking down the high street.

Immigrants are the modern day bogeymen, easy to label and criticise, but much more difficult to understand. The media and some politicians have effectively dehumanised anybody who isn’t British and use the term “immigrant” to imply that anybody who comes here does so to suck the life and soul out of Britain.  UKIP can take much of the blame, but the Tory tub-thumping on immigration, and Labour’s complete inability to offer an alternative voice has made Britain a very unwelcoming place in the last few years.  I think that the effect of this is much more corrosive than the odd idiot shouting obscenities in the street.

A fact often lost in the EU debate is that EU migrants contribute very significantly to the British economy and British life.  They contribute far more in expenditure and taxation than they claim in benefits. Many of them are highly skilled and would be difficult to replace post-Brexit.  Most sensible politicians know this, which is why they usually talk of immigration numbers, points systems and the like, without actually being positive about EU migrant contributions (that would make them look human, you see, and we can’t have that).  But the corrosive effect of all of this is that these migrants, given the choice, will probably leave the UK post-Brexit, and our country will be much the poorer for it.  This is not because there will be any particular policy that makes them leave, but because the clear attitude of a Britain that votes to leave is that we don’t want to work with YOU.

Although it could be argued that we are just trying to extract ourselves from the political project that is the EU, the truth is that the EU migrants I know have been mulling over whether to stay in the UK or not for a few years now. This has been ever since UKIP gained MPs in the House of Commons, one of whom was briefly the MP in my constituency.  With Brexit now a distinct possibility, one colleague, a Dutch national, is virtually resigned to returning to the Netherlands if it occurs because Dutch citizens are not allowed to adopt dual nationality.  To continue to work in the UK post-Brexit would require a work permit of some kind, and all of the hassle and uncertainty that goes along with it.  Others have cited likely restrictions on work and travel for them and their family members as reasons to leave.

But I’m now going to avoid calling them EU migrants. Instead I’m going to call them what they really are: family and friends.  My niece and nephew are both Italian nationals, and I have French, Belgian, Finnish, German, Italian, Greek, Dutch and Polish colleagues in my institution and elsewhere whose lives will be seriously compromised, if not destroyed entirely, in the event of Britain leaving the EU.  These are brilliant, talented people who deserve far better than to be the collateral damage in what is, and always was, the Conservative Party’s internecine pet project.

So yes, I want my country back. A country that is welcoming, tolerant and outward looking. And to get that country back I’m voting Remain.

Dear Alex: Conversations on the way to school

May 31, 2016

This post is based on a genuine conversation I had on a walk to school with my son (he is 4). I’m posting it here because I think it is as good an advert for having children as anything. Children are amazingly original thinkers even if their view of reality is not always (if ever) quite right. I hope my son keeps his current enthusiasm for finding things out.  Here goes:

Dear Alex:

Our trip to Madame Tussauds clearly made quite an impression on you, judging by our conversation on the walk to school. I feel I need to clear a few things up:

The guy with the feather was William Shakespeare. The guy who looked like a scarecrow in the bit that smelled of poo was a plague doctor. The plague is not “The Black of Death” as you keep saying (you’re close though), and you can’t get it because you sneeze a lot. We have drugs and sanitation now, so doctors don’t need to dress like scarecrows.

The guy with the eye patch and the “pirate hat” was Lord Nelson. He has an eye patch because he hurt his eye in the Battle of the Nile. He is NOT a pirate. In answer to your other question, I don’t know what his friends called him but it was probably something like “Horatio”. He definitely wasn’t a pirate. I know that I wasn’t born when he was alive, but that does not increase the probability that he might have been a pirate just because you think he was. And you don’t win this argument by saying that you know he was a pirate because “Lord Nelson is my brother’s dad”. Even if you had a brother, this would be chronologically and biologically implausible. Whilst he has an eye patch and a hat that makes him look like a pirate HE IS NOT A PIRATE. Nelson has a column too, but that doesn’t make him a journalist.

If you want to continue this line of reasoning, I am quite happy to set up an anonymous Twitter account and sign you up to a few internet forums. You’d excel at trolling.


“This oxygen uptake value doesn’t look right”: on interpreting exercise tests in athletes

December 15, 2015

This post is inspired by a discussion between Antoine Vayer, Jeroen Swart and myself a few days ago on Twitter.  Vayer is a former coach of the Festina cycling team, and a strong advocate of interpreting power output data in the context of doping.  Swart is an exercise physiologist and sports physician who has been involved in the testing of Chris Froome at GSK (or embroiled in the testing, depending on how you look at it – the social media reaction has been quite something.  I can’t put my finger on quite what kind of something it’s been, but it’s something nonetheless).  I noticed that Vayer had put up some data from an incremental cycling test, with the following challenge to “experts”:  “Game 1/10 for experts from Lisbeth ! Who got this VO2 [oxygen uptake] >91 ml/mn/kg ? Is he a cheater or not ? Is it possible ?”.  Now, I consider myself something of an expert here.  In fact, I’d say that the number of people in the world who understand the VO2 response to exercise better than me could comfortably fit in a double-decker bus, and some of them are dead.  So I had a quick look at the data.

The maximal oxygen uptake (VO2max) value was recorded during an incremental test in which it appears that the athlete exercised for about 4 minutes each stage and rested between stages, with gas exchange data recorded every 30 seconds.  The 91 mL/kg/min VO2max was recorded towards the end of the penultimate stage displayed in Vayer’s tweet.  But there was something odd about it. The VO2, in absolute terms, was 6.29 L/min, but this value was achieved at a power output of 425 W.  Any exercise physiologist faced with a high VO2 will naturally enquire about the power at which it was achieved, or if faced with a high power output, will enquire about the VO2 achieved.  The first thing you learn (or should learn) when analysing test data like this is to ask the question “does it look right?”.  If the data deviate wildly from what is considered normal, it’s probably wrong for some reason.  This works for athletic data as well as any other because there are robust relationships between VO2 and power output.  Vayer’s data make no physiological sense.

The VO2-power output relationship follows this rule-of-thumb: for every watt of power produced, you consume about 10 mL of oxygen in the steady state (give or take 1 mL/min/W).  This makes for a very handy error detector in the lab.  This works by taking any absolute VO2 value (in this case 6.29 L/min), subtracting ~0.80-1.00 L/min to account for the O2 cost of pedalling, and dividing the answer by power output.  In this case, (6290 − 1000)/425 = 5290/425 = 12.4 mL/min/W (this figure is known as the “gain” for VO2).  In other words, there appears to be a very large error in VO2 of about 1.0 L/min.  The actual VO2 that should be associated with a power output of 425 W is ~5.25 L/min (assuming a baseline pedalling VO2 of 1.0 L/min, which at a cadence of 92 rpm seems reasonable).  It may even be lower than that in an efficient athlete. Jeroen Swart estimated 4.95 ± 0.15 L/min VO2 for the same power.

By way of contrast, Chris Froome’s recent data produce a gain of ~9.4 mL/min/W (5.91 L/min, less 1 L/min: (5910 − 1000)/525 = 9.4).  This is a normal VO2 gain, and may even be an underestimate given that this was measured during a ramp test, in contrast to the 4 minute stages used by Vayer (quite correctly if steady state VO2 was an important variable to measure).  In Froome’s case, the 30 W/min ramp is non-steady state and VO2 will lag power output.  Additionally, if Froome had a plateau in VO2 (or was approaching one), this would have reduced the gain still further.  Thus, it is reasonable to suppose that his steady-state VO2 gain would be higher than 9.4 mL/min/W. In all likelihood, it would be very close to 10 mL/min/W.  Why, then, is the VO2 recorded in the test in question so high?  There are four possibilities: 1) an extravagant metabolic response by the cyclist; 2) an error in standardisation of ambient conditions; 3) an ergometer calibration error; or 4) an error in the flow or oxygen sensor and/or calibration.
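The rule-of-thumb check above is simple enough to script. A minimal sketch using the numbers quoted in this post (the 1.0 L/min baseline for unloaded pedalling is the assumption stated earlier, not a measured value):

```python
# The "gain" error detector: (VO2 - baseline pedalling VO2) / power,
# which should come out at ~10 mL/min/W (give or take 1) in the steady state.

def vo2_gain(vo2_l_min, power_w, baseline_l_min=1.0):
    """Return the VO2 gain in mL/min/W after subtracting the O2 cost of pedalling."""
    return (vo2_l_min - baseline_l_min) * 1000 / power_w

print(f"Vayer's test:  {vo2_gain(6.29, 425):.1f} mL/min/W")   # ~12.4: implausibly high
print(f"Froome's test: {vo2_gain(5.91, 525):.1f} mL/min/W")   # ~9.4: within normal range
```

Anything much outside the 9-11 mL/min/W window should prompt a hunt for the error before any physiological interpretation is attempted.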

Extravagant metabolism

It is possible that the relationship between VO2 and power output can be altered by exercising at high intensities for prolonged periods.  This increases the VO2 gain, and values >12 mL/min/W have been observed in the literature.  So the “slow component of oxygen uptake” could drive the VO2 above the predicted steady state value and increase the gain.  I have done a bit of work on the slow component, and I think it may be a factor in this test.  It is, however, unlikely to explain the VO2 being 1.0 L/min above expected.  This is because, as its name suggests, the slow component takes time to express itself.  It takes 90-120 s to emerge from the “normal” or “fast component” of the response, and if you fit a curve to it, it has a time constant of at least 200 seconds.  It therefore takes many minutes to develop, and in this test many minutes we do not have (nor does the VO2 seem to be systematically rising in the 425 W stage).  It is also unlikely that previous test stages generated much slow component behaviour.  This is again partly due to their length, and partly due to the modest blood lactate concentrations measured (2.1 mM at 375 W).  The slow component only develops above the lactate threshold, and in the 375 W stage the cyclist is only just above it.  Any slow component would be small even if the 375 W stage was continued for 6-8 minutes.  Finally, the denominator of the VO2 gain equation is against us: the largest slow components recorded in the literature are for exhaustive exercise bouts lasting 10-15 minutes, and they can exceed 1 L/min.  To achieve this in the time available in this stage would need VO2 to rise at an unbelievably swift rate, and we just don’t see this happening.  So, whilst the slow component of VO2 is real and significant, it does not appear to be an explanation for the high VO2 values reported.

Ambient conditions and gas exchange calculations

Pulmonary gas exchange variables must be corrected from ambient temperatures and pressures to standardised temperatures and pressures before interpretation.  These days the calculations are done automatically by the measurement system, but sometimes input is needed as part of routine calibration.  For ventilation, values are expressed in BTPS – Body Temperature and Pressure, Saturated – because the true volume exhaled at the time matters.  For VO2 (and VCO2), the correction provides Standard Temperature and Pressure for Dry gas (STPD).  The only way to introduce error here is to fail to change the ambient temperature or pressure settings prior to analysis.  However, a 10°C error in temperature (unheard of in a well-ventilated or an air-conditioned laboratory) would change VO2 by no more than ~0.3 L/min.  A similar error would occur if you got barometric pressure wrong by 30 mmHg.  Thus, it is unlikely that a 1.0 L/min error would be introduced here.
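To show where the ~0.3 L/min figure comes from, here is a rough sketch of the ATPS-to-STPD conversion factor and its temperature sensitivity. The water vapour pressures are approximate textbook values, and the 30°C scenario is hypothetical:

```python
# ATPS -> STPD conversion: V_STPD = V_ATPS * (PB - PH2O) / 760 * 273 / (273 + T)
# PH2O = saturated water vapour pressure (mmHg) at the measurement temperature.

PH2O = {20: 17.5, 30: 31.8}  # approximate values at 20 and 30 deg C

def stpd_factor(temp_c, pb_mmhg=760):
    """Conversion factor for a saturated gas volume at temp_c to STPD."""
    return (pb_mmhg - PH2O[temp_c]) / 760 * 273 / (273 + temp_c)

vo2_measured = 6.29  # L/min, as corrected assuming a 20 deg C lab
# If the true temperature were actually 30 deg C (a 10 deg C settings error):
vo2_at_30 = vo2_measured * stpd_factor(30) / stpd_factor(20)
print(f"Error introduced: {vo2_measured - vo2_at_30:.2f} L/min")  # roughly 0.3 L/min
```

Even an implausibly large temperature error, in other words, cannot manufacture the missing litre per minute.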

Ergometer calibration

A cycle ergometer that is not calibrated can produce very strange VO2 responses.  I vividly remember taking delivery of an ergometer that had not been stored correctly and its flywheel was off by a few millimetres.  On an electrically-braked ergometer, a few millimetres is huge, and I was exhausted at a power output 150 W lower than normal with an (apparently) enormous VO2 gain.  But my VO2max was in the normal range, so it didn’t seem to be the gas analyser, and it just felt wrong.  In the case in hand, it is the VO2 that seems too high, not the power.  And an elite cyclist would very quickly realise if the ergometer was out. As an example, I once started a treadmill test on an athlete who got quite animated about how hard the 10 km/h warm-up felt.  It turned out that somebody had changed the settings to miles per hour!  The ergometer is not likely to be the problem in this case.

The gas analyser

The origin of the error can thus be narrowed down to the gas analysis system.  I am not sure whether the Oxycon system being used in this test (as Vayer told me it was) was being used in a mixing chamber mode or breath-by-breath, but the ventilatory variables look normal: minute ventilation is not astoundingly high for an athlete producing 90 mL/kg/min (if anything it is too low).  Indeed, the VE/VO2 ratio at maximal exercise is usually >35 in my experience.  Here it is ~32.  Not that low, but low all the same.  In short, the flow sensor or ventilatory volume measurement is not a strong contender for the extra litre/min of VO2.

This leaves us with the O2 sensor itself.  The origin of the error is impossible to pin down without knowing the precise specifications of the analyser (there are a number of Oxycon models, some of which use fuel cell type analysers, others paramagnetic sensors), but it is possible that an electrochemical fuel cell analyser had reached the end of its life at the time of the test and started reading high. Alternatively, a calibration error resulting in incorrect zero and/or span calibration could have caused a systematic error in VO2.  It is important to state that this error is not peculiar to the 425 W stage: the 375 W stage preceding it produces gain values of around 12 mL/min/W, so this error is evident throughout the test [in a previous edit, I said 14.5 mL/min/W – I’d forgotten to subtract baseline VO2 in the calculation.  Mistakes are easy to make with gas exchange data!].  A calibration error on one of the calibration points would amplify the erroneous gain at lower power outputs, wherein the expired O2 fraction (FEO2) would be lower than during maximal exercise (that is, the error will get larger or smaller as FEO2 falls away from 20.95%).  That we are not seeing this means that the whole calibration curve is systematically in error.  The cause of this (analyser ‘drift’ or fuel cell end-of-life performance) is impossible to call without data from the calibration procedure itself.

In conclusion, I’m overthinking this.

Or, to put it another way, the issues above illustrate why physiological testing is unlikely to ever be a major pillar of anti-doping efforts: there are too many sources of error, and too much variation between labs in testing protocols, equipment and ergometers.  Anybody who has attended a conference on sports physiology will appreciate that there are almost as many measures of “threshold” as there are people working in the field.  Getting scientists and practitioners to agree measurement standards seems a very long way off.  Even if such standards could be agreed, there are no clear physiological “red lines” above which doping can be inferred.  This is because athletes, like all humans, occupy a normal distribution of physiological function.  More correctly, the parameters of endurance performance (be they physiological, biomechanical, psychological) are all normally distributed, and the sum of these distributions makes the athlete who they are.  Doping can and does shift some of those curves, but from where to where? For specific individuals we simply don’t know most of the time, and until we are sure a change in physiological test results is not due to errors we inadvertently introduce, we never will know.


So with that, Merry Christmas…

Reigning Blood: Thoughts on the Scandal of Blood Doping in Athletics

August 22, 2015

I was asked if I’d write a thing about doping for a website, so I did but the editor never got back to me.  Here is what I wrote:

This year the World Athletics Championships is being held in Beijing. But there is no doubt that this event is clouded by a doping scandal that threatens to seriously damage, if not destroy, the sport’s credibility.  The scandal was the result of two news reports.  The first concerned the allegation that a high proportion of Russian athletes were involved in systematic doping.  The second was the Sunday Times/ARD programme and publication in the first weekend of August which leaked extracts from an historic blood profile database held by the International Association of Athletics Federations (IAAF). When these profiles were analysed by experts in biological passport analysis, the results suggested that as many as 1 in 7 athletes from 2001 to 2012 had returned “suspicious” values, tainting many major championships, including the 2012 Olympic Games.  Clean athletes have lost medals, failed to make finals, and lost funding because they were out-competed by athletes who were doping.

The reaction to the leak has itself been revealing: the leaders of the IAAF have come out fighting against the findings and denying any wrongdoing; the experts concerned have countered these arguments. Finally, social media platforms have been ablaze with comment and speculation about which athletics star will be implicated next.  This scandal can, I think, be better understood if the scientific basis of the allegations is understood. Furthermore, the reaction of those at the top of sport and anti-doping, as well as the public reaction to the scandal, tells us a lot about how a future scandal such as this might be avoided.

What does “a suspicious profile” actually mean?

The athlete biological passport (ABP) is a method used to detect doping, not by finding a drug in blood or urine samples, but by finding evidence of doping in biomarkers such as haemoglobin concentration.  The ABP works by an individual athlete regularly giving blood samples, from which a profile is generated with individual limits (“probability thresholds”, currently 99%) set to identify abnormally high or low values. A “suspicious” profile is one which exceeds these limits at some point or points, or which contains deviations or fluctuations that are identifiably abnormal (these are known as “Atypical Passport Findings”). An expert reviews the profile and uses other data (if available) to confirm that it is not likely to be due to pathology or specific circumstances that might affect blood sampling (e.g., illness).  If those other data are not available, the result remains “suspicious” and follow-up data collection and additional testing can be performed.  The “suspicious” profiles noted by the Sunday Times/ARD experts were those exceeding a 99% probability threshold.  This is consistent with the current WADA guidelines.
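To make the idea of an individual 99% limit concrete, here is a deliberately simplified sketch. The real ABP uses a Bayesian adaptive model that updates the expected range with every new sample; this toy version just builds a normal-theory interval around a hypothetical athlete's baseline values:

```python
# Toy illustration of individual "probability threshold" limits.
# NOT the actual ABP adaptive model: a plain mean +/- z*SD interval
# around an athlete's own historical samples.

from statistics import mean, stdev

def individual_limits(samples, z=2.576):  # z = 2.576 for a ~99% two-sided interval
    m, s = mean(samples), stdev(samples)
    return m - z * s, m + z * s

hb_history = [14.8, 15.1, 14.9, 15.3, 15.0]  # hypothetical [Hb] values, g/dL
lo, hi = individual_limits(hb_history)
new_sample = 16.9  # hypothetical new test result

print("atypical" if not (lo <= new_sample <= hi) else "within limits")
```

The point is that the limits are the athlete's own, not population-wide: a value that is unremarkable for one athlete can be flagged for another whose baseline is stable and lower.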

The reason the “suspicious” profiles are not direct evidence of doping is that the ABP requires further review for this to occur. A suspicious result warrants further review by two independent experts if the first reviewer has ruled out normal physiology and pathology. Only if the three reviewers unanimously agree that the use of a prohibited substance or method is highly likely, and that illness or other factors are unlikely to account for the results, can an “adverse passport finding” be declared, which then initiates disciplinary procedures against the athlete.

It is worth pausing here briefly to make the point that ABP findings are considered “atypical” and “suspicious” when there is not enough evidence to be certain that things other than doping could explain the finding. Results only become “adverse” (that is, the athlete can be sanctioned) with further evidence and review.  The data on which the scandal is based are atypical, NOT adverse findings.

The rigour with which the ABP is administered and managed makes it sound perfect, but it isn’t.  As with standard doping control tests, the ABP is designed specifically to avoid false positives (that is, you want to go out of your way to avoid sanctioning a clean athlete).  This means that the false negative rate is high. In other words, cheats can and do avoid sanction (more on this later).

The issue the experts in the Sunday Times/ARD story took with the IAAF was that suspicious findings did not seem to be followed up.  The IAAF did have procedures in place to do so.  The allegation that they did not is, therefore, very serious. I am in no position to say whether this is true or not, but on other points of the story I have some sympathy for the IAAF.  Much of the focus in the original story was on the World Championships in Helsinki in 2005.  But the sanctioning of athletes using the ABP did not come into force in athletics until 2009.  Judging the actions of the IAAF by the standards of today seems a little unfair.  That said, Dr Michael Ashenden, one of the experts consulted and an architect of the ABP itself, makes a compelling argument that the IAAF could and should have done more at the time in any case.

The good, the bad, and the ugly

The reaction to the scandal has been an interesting exercise in sports politics and the power of social media.  WADA stated that it was shocked and within a week had announced an investigation into the allegations.  The full scope of this investigation is not clear, but it is likely to focus, in part, on how the data were disclosed rather than how the IAAF responded to the blood profiles pre-2009. Lord Coe, now president of the IAAF, came out fighting, suggesting that war had been declared on athletics, and dismissing the analysis of the “so-called experts”.  This was likely part of his successful attempt to get elected to the presidency, but it was also an exercise in how not to manage a crisis.  The experts he maligned really were experts in this particular field.  They have diagnosed that his sport is sick, and he needs to listen to their advice.  More broadly, in the post-Armstrong era, having an international federation “circle its wagons” in this way was bound to lead to accusations of cover-up, whether true or not.

The IAAF’s bullish reaction to the scandal led to commentators on social media in particular savaging the IAAF.  This is understandable, given that these commentators are already extremely hostile to any official narratives in the wake of the Lance Armstrong case and the role the UCI played in it.  They were also primed by a Tour de France in which suspicion of the GC contenders grew exponentially in spite of attempts to prove innocence. It was to be expected that they would turn their guns on athletics, and several high-profile athletes have been the subject of intense speculation as to whether their ABP profiles are suspicious.  Some of that commentary was thoughtful, detailed, and nuanced, but even the most cursory search on Twitter shows many high-profile athletes being very seriously libelled with doping allegations.

Given recent cases, such as McAlpine vs. Bercow, naming names without evidence on social media is extremely ill-advised.  The acid test here is: has an athlete been mentioned by name? Has doping guilt been implied, even indirectly? Does the athlete in question have an adverse ABP finding? If the answer to the first two questions is “yes”, and the answer to the last question is “no”, then the athlete has been libelled, in the UK at least.

How should the IAAF deal with this in the future?

It is important to remember that the goal of anti-doping is to protect clean athletes.  This is difficult because proving a negative in systems shrouded in secrecy is almost impossible.  Some form of transparency that federations and athletes can buy into would be a start.  Some athletes have chosen to make their passport data public, whereas others have not given their consent. This is already leading to assumptions that a lack of disclosure means that the athletes in question have something to hide.  This might be true, but it may also mean that the athletes possess passport profiles that contain atypical findings that are explicable, but they are concerned that key contextual details may be lost or simply dismissed.  It is also possible that the explanation for an atypical finding contains information of such a personal nature that the athlete does not want any of the information in the public domain.  All that considered, the IAAF and WADA urgently need to build transparency into their systems.  One way would be to periodically, or on request, perform a full three-expert review to give an athlete a clean bill of health (or otherwise), and publish the resulting ABP Document Package with the athlete’s consent.

Finally, there is the future of anti-doping per se.  If we have learnt anything from the Armstrong case, it is that analytical anti-doping measures will only go so far.  What did for Armstrong was tenacious investigative journalism and equally tenacious law enforcement input, followed by a forensic investigation by USADA.  In this regard, catching cheats isn’t just about blood.  If anti-doping policy was a pheasant shoot, doping controls and the ABP would be akin to beating the ground and sending in the dogs: to be successful, you still need the people with guns in the end.  WADA and the sports federations covered by the code need to use all necessary means to catch doping cheats, because the current systems are not protecting clean athletes in the way they should.

Data transparency in cycling: necessary, utopian, and a complete can of worms

July 19, 2015

This year’s Tour de France is developing into a bit of a split race, being both exciting by stage and predictable by General Classification (GC).  This was most clearly demonstrated by the blistering performance of yesterday’s stage winner Steve Cummings of MTN-Qhubeka (the African team’s first stage win, on Mandela Day, no less), followed by Chris Froome hoovering up all attacks against him.  It was an eventful ride for Team Sky, with fists, saliva and urine apparently being thrown at them.  They are currently the sport’s bad guys, for no reason other than dominance.  The last team to dominate like Sky did was one of the liveries led by Lance Armstrong, and Sky’s tactics and public relations stance continue to draw uncomfortable parallels with the Armstrong era.  This suspicion has led to calls for Sky (and others) to be more transparent about their power data in particular, the view being that teams with nothing to hide should hide nothing.

Something something Armstrong, something something Froome. Right, let’s SCIENCE… [Forget personalities, there’s a link in two paragraphs’ time in which the awesome David Wilkie uses very simple power modelling to make a bicycle fly.]

Power output and the physiological response to exercise

Mountain stages in the Tour are critical to success.  One bad day in the mountains can cost you the race, and a good day can get you a Yellow Jersey.  In contrast, sprint stages rarely produce gaps in the GC, and time trial stages are predictable and (to within a minute or so) run to form.  That’s problematic if you’re on the wrong side of the minute, but not fatal.  Gaps of over 5 minutes are sometimes seen in the mountains.  In a mountain climb, where air resistance plays a more limited role, the rider who can sustain the highest power, and thus (bike and body mass accounted for) the highest speed for the duration of all the climbs, is likely to win the Tour.  Time trial specialists cannot win the Tour with time trials alone.  They must train to climb (contrast Boardman’s Tour performances with those of Wiggins – riders with very similar initial backgrounds but very different training approaches to the road).

Because sustaining a high power output on a climb is crucial, there has been a great deal written about the limits of what is possible.  I am not going to add to this debate because, in my view, without direct measurement of power, as well as an understanding of the rider’s physiological capacities (aerobic and anaerobic), there are too many assumptions involved to draw a firm conclusion about whether a performance is possible or not.  There are performances that might look suspicious, but a 4 min mile performance in running would have looked suspicious in 1935.  By 1995 it was considered slow.  We do, however, know a few things about what determines sustained power, thanks to scientists like AV Hill, David Wilkie, and a number of others.

The physiological response to exercise depends on the power output you produce.  For “moderate” exercise, muscle oxygen uptake rises rapidly and reaches a steady state.  Blood lactate either does not rise or rises only transiently.  At these work rates, exercise can be sustained for many hours.  For “heavy exercise”, when you exceed the lactate threshold, oxygen uptake takes longer to stabilise and does so at a higher value than you would predict from steady state responses to moderate exercise (in other words, you are less efficient).  This is the result of a “slow component” of the oxygen uptake response that develops after about 2 minutes of exercise and stabilises after 15-20 minutes.  In the heavy domain, exercise can be sustained for between 45-60 min and about 3-4 hours.  For “severe” or “high-intensity” exercise, the oxygen uptake slow component does not stabilise (and nor does any other metabolic response) until maximal oxygen uptake (VO2max) is attained.  Exhaustion inevitably follows soon after this occurs. The “severe-intensity domain” commences when you exceed the critical power (CP).  The CP, in turn, represents the asymptote of the power-duration relationship, first noted by AV Hill in 1925.  We’ve written a few papers about these concepts, which you can find here (free) and here (not free).  The power-duration relationship can be defined by as few as two parameters, namely the CP and a parameter to define the shape of the curve, denoted W’.  The CP is thought to reflect the power of the aerobic systems of energy delivery and the W’ is thought to reflect the “anaerobic capacity”, although we know this is a little simplistic. It is the power-duration relationship that is important for working out what is and is not possible when cycling up a hill.
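To make the three-domain scheme above concrete, here is a minimal sketch that classifies a power output as moderate, heavy, or severe given a rider’s LT and CP. The function name and the LT/CP values are illustrative assumptions of mine, not measurements from any rider.

```python
# Sketch: classifying a power output into the intensity domains described
# above. The LT and CP values used below are illustrative assumptions only.

def intensity_domain(power_w, lt_w, cp_w):
    """Return the exercise intensity domain for a given power output (W)."""
    if power_w <= lt_w:
        return "moderate"  # VO2 reaches a steady state; little lactate rise
    elif power_w <= cp_w:
        return "heavy"     # a slow component develops, but stabilises
    else:
        return "severe"    # VO2 drifts to VO2max; exhaustion soon follows

# Hypothetical trained cyclist: LT = 300 W, CP = 420 W
for p in (250, 380, 450):
    print(p, intensity_domain(p, lt_w=300, cp_w=420))
```

The boundaries are, of course, sharper in code than in physiology: LT and CP are estimated from testing and carry measurement error of their own.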

Defining possible and impossible

If you know the values of CP and W’, and you know the power demand of a task, you can make a clear prediction about what the time limit of the task is. The equation, for those who want it, is:

Time limit = W’/(power output – CP)

The problem is that the above parameter values will vary between athletes and will vary day to day.  The parameters and the underlying physiology that determines them can also be influenced by various acute interventions (like glycogen depletion, for example), which adds further uncertainty to any “back of the envelope” calculations that you might wish to make. To know if any performance is abnormal, you need to know what the power-duration parameter values actually are. Consider that an elite cyclist with GC ambitions might have a CP of about 380-440 W, and a W’ of 20-30 kJ, both of which will depend, to some extent, on body mass.  This means that to complete an effort lasting 40 minutes, with a W’ of 25 kJ and a CP of 420 W, the “normal” power output sustainable for this duration would be 430 W, or 6.1 W/kg for a 70 kg rider.  Notice here that the contribution of the curvature constant to long duration efforts is quite small (about an extra 10 W over 40 min) and thus the most crucial determinant of mountain performance is the maximal sustainable power output, CP.
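The worked example above can be checked in a few lines. This sketch implements the two-parameter model from the equation earlier (time limit = W’/(power – CP)), using the CP and W’ values assumed in the text; the function names are mine.

```python
# Two-parameter critical power model: Tlim = W' / (P - CP).

def time_limit_s(power_w, cp_w, w_prime_j):
    """Predicted time to exhaustion (s) for a power output above CP."""
    if power_w <= cp_w:
        raise ValueError("model applies only to powers above CP")
    return w_prime_j / (power_w - cp_w)

def sustainable_power_w(duration_s, cp_w, w_prime_j):
    """Power (W) sustainable for a given duration, rearranging the model."""
    return cp_w + w_prime_j / duration_s

# Worked example from the text: CP = 420 W, W' = 25 kJ, 40 min effort
p = sustainable_power_w(40 * 60, cp_w=420, w_prime_j=25_000)
print(round(p, 1))       # ~430.4 W, i.e. only ~10 W above CP
print(round(p / 70, 2))  # ~6.15 W/kg for a 70 kg rider
```

Note how small W’/duration becomes for long efforts: over 40 minutes the curvature constant buys barely 10 W, which is the point made above about CP dominating mountain performance.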

One reason why I don’t think fixating on a particular W/kg value as “possible” or “suspicious” really works is that it all depends on the value of CP.  This value is unknown and variable!  Obviously, for a 70 kg rider to sustain 6.1 W/kg without drawing on the W′, CP would need to be at least 430 W.  I don’t think that is unreasonable, given previously documented hour record performances and the power outputs produced during them (Bassett et al., 1999).  To sustain 430 W would require an oxygen uptake of approximately 5.3 L/min (O2 cost of ~10 mL/min/W, plus ~1 L/min for the O2 cost of spinning the legs at 90-100 rpm), which, if the rider were capable of utilising ~90% of VO2max, would predict a VO2max of ~5.9 L/min or 84 mL/kg/min.  This is high, but certainly not unheard of.  If only 85% of VO2max could be utilised, the required VO2max would be 89 mL/kg/min.  That is still not impossible.  And this is all assuming a normal mechanical efficiency.  Efficiency would decrease due to the development of a slow component of oxygen uptake, but this would add no more than about 200 mL/min to the tally (that is, oxygen uptake would remain submaximal even with this factored in).
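The oxygen-uptake arithmetic in the previous paragraph can be reproduced as follows. The ~10 mL/min/W O2 cost and ~1 L/min baseline are the assumptions stated in the text; the function name is mine.

```python
# Back-of-the-envelope VO2max requirement for a given sustained power,
# using an assumed O2 cost of ~10 mL/min/W plus ~1 L/min baseline for
# unloaded pedalling (both figures are the text's assumptions).

def required_vo2max_l_min(power_w, frac_vo2max,
                          o2_cost_ml_min_per_w=10.0, baseline_l_min=1.0):
    """VO2max (L/min) needed to hold a power at a fraction of VO2max."""
    vo2_l_min = power_w * o2_cost_ml_min_per_w / 1000 + baseline_l_min
    return vo2_l_min / frac_vo2max

# 430 W at 90% of VO2max, for a 70 kg rider
v90 = required_vo2max_l_min(430, 0.90)
print(round(v90, 1))           # ~5.9 L/min
print(round(v90 * 1000 / 70))  # ~84 mL/kg/min

# If only 85% of VO2max could be utilised, the requirement rises
v85 = required_vo2max_l_min(430, 0.85)
print(round(v85 * 1000 / 70))  # ~89 mL/kg/min
```

The point of putting it in code is that every number is visibly an assumption: change the fractional utilisation or the efficiency and the “required” VO2max moves by several mL/kg/min.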

Knowing the possible

The above calculations are hypotheticals based on reasonable estimates.  The numbers accompanying Froome’s (or Nibali’s, or Contador’s, or…) performances that appear on the internet are just as hypothetical.  In short, we have no direct numbers for either physiological capacity or performance for GC riders at the time of the Tour.  Values estimated from those recorded in other parts of the season are likely to underestimate the capacity of a rider who has peaked and ridden conservatively in much of the first week of racing.  In addition, direct measures of rolling resistance, wind speed, temperature, altitude, and so on, are also absent.  To know what’s possible would require direct power-duration measurements from Froome immediately before the Tour, as well as calibrated power data during each and every stage.  It is likely that Sky possess both data sets.  They most likely have a variety of physiological measures that could corroborate the power-duration data (i.e., the VO2max, efficiency and LT data would likely fit in the same general picture).  But they refuse to place these data in the public domain.  Should they?  The scientist in me says yes.  The sports fan in me says maybe.  The pragmatist in me says that there is next to no chance of these contemporary data ever seeing the light of day.

For one thing, Froome’s consent would be needed to release these data, and even if that consent was given, where would the data be stored and how would access be gained?  If Froome releases his, every rider in the Peloton should be obliged to release theirs, lest there be any accusations of unfair treatment.  The teams are highly unlikely to want to do this for competitive reasons.  It’s much the same reason why Formula 1 teams do not release telemetry data in real time – a good rival engineer would identify engine modes, brake balance, tire wear etc and use that information to the team’s advantage.  The sport would become even more about who has the best support crew rather than the best performer.

A less problematic point is that not all teams use the same power-measuring devices.  Moreover, where on the bike the power is measured also matters.  Although often measured at the crank or the rear wheel hub, it’s the power transferred to the road that counts (producing forward propulsion), but the power actually produced on the pedals that costs (in terms of physiological demand).  Frictional losses and rolling resistance (though presumably minimised) will also differ, adding errors to any calculations of who produced watt, where and when…

The Future

There has been some chatter on Twitter and elsewhere about power files from races being used as part of the Athlete’s Biological Passport in cycling.  I can see some merit in this, as within each athlete, performances can be compared to their power-duration relationship, their physiology, and the blood parameters already used.  The observation of abnormal power outputs alongside sudden changes in, for example, the Off-score, might trigger closer scrutiny of that athlete in the coming months.

Finally, I can see the potential for power data from grand tours being released following an agreed embargo period.  This would serve an educational and scientific purpose, providing a rich seam of data to be used by anybody who wanted it.  Those data could also be used as part of a retrospective anti-doping case.  But they’d only ever be part of the story.  If there was reasonable circumstantial evidence of doping in the absence of a positive test (like the Armstrong case, for the most part), then power files could add weight to that case.  But that weight would only be small, given the number of variables involved in ultimately producing power output.

I’ve almost certainly not done this issue justice, but the above thoughts lead me to conclude that the question of data transparency in cycling, and its potential uses, does not have any easy answers.


The contentious and irrelevant bit:

Armstrong’s shadow

The similarity between Armstrong-era cycling and today ends with what is written above.  Quite a few people have asked David Walsh, the man who was instrumental in taking down Armstrong, why he is not asking Sky and Froome tough questions.  I personally think that is wrong-headed.  Armstrong’s transformation post-cancer was mind-blowing, whereas Froome’s ascent has been more incremental.  Add to that the accumulation of damning evidence throughout Armstrong’s career, covered up by Armstrong with the help of the UCI, and in that case Walsh could ask questions about tangible things in Armstrong’s closet.  Froome’s closet is bare by comparison, save for a TUE and some stunning performances on the road.  So there was good reason to pursue Armstrong, but much less to pin on Froome and Sky.  This is why calls for data transparency are timely.

[EDIT: I’ve got quite a few comments about the LA/CF comparison I made above both here and elsewhere.  I’ve a good mind to delete it because I think it detracts from the point I’m trying to make (that power output is interpretable in context but it’s always likely to be very difficult).  I’m not going to delete it, however, because it would make the post more boring, and writing really bloody boring stuff is something I’m already pretty good at.  My point to those arguing over this is that (like the rest of the post, actually) context is everything.  We know most of the details surrounding Armstrong’s “inverted U-shaped” career progression, in which he went from an aggressive stage racer/breakaway specialist to cancer patient to GC domination.  In this, he went from not really being notable as a climber to dropping Pantani. That’s astonishing.  Froome burst onto the scene in late 2011, but had been with Sky for about 18 months before that, and showed some potential between bouts of illness and injury.  The impact of these is not certain because, again, of the context. Sky, due to its links with British Cycling, was and still is awash with riders who are good against the clock, so he wasn’t in the team to be the time triallist.  He was clearly there to be part of the Sky train, in a team hell-bent on GC success with Bradley Wiggins.  In that context, Froome’s rise to prominence was not particularly fast, but was perhaps unexpected given the circumstances at the time. Importantly, he didn’t completely change his style of riding to break through.  I am not naïve enough to think that there aren’t other explanations, but, again, that’s not what the post is about, and I’ve tried to avoid drawing any of those conclusions. Back to the day job…]

Reflecting on the Research Excellence Framework 2014

January 24, 2015

The results of the 2014 REF are out, and by and large, the results were positive for the sector as a whole and our unit of assessment (UoA 26) in particular.  I am no fan of the REF, but it is something we are stuck with in UK universities, in the same way that we are largely stuck with insanely difficult-to-remember passwords that seem to need changing every five minutes.  As a process, it is still a long way from perfect: we are still not told which papers receive which ratings: is that too much to ask, HEFCE? We had to do that ourselves about 4 times in the run-up to submission. Couldn’t you just, y’know, not systematically destroy the specific output feedback next time? Especially when each panel member (usually professorial-level researchers) had to review and rate >500 outputs. Each. But as usual I digress. The opacity of the exercise is not my point this time.

The sports-related studies UoA returned results that were considerably improved compared to last time, even considering the general grade inflation seen.  It is no exaggeration to say the results in the equivalent panel were pretty poor last time, in part for political reasons (or so I’m told). One reason for the improvement is that the study of sport, leisure and tourism is maturing – as are those working in the area!  Another, I think, is that our discipline is extremely good at demonstrating impact. London 2012 helped that, but we are also making significant in-roads into areas of disease management and public health, previously the preserve of clinical medicine, for example.

Having spoken to various people, including heads of schools and panel members, the general perception is that the REF was broadly fair in the sense that whatever strategy you adopted you were rewarded for it. My institution favoured staff inclusivity rather than maximising 4* submissions, and we have done well on that score. Post-1992 universities were more selective, justifiably so because they emphasise non-research work and income rather more than the Russell Group. Many of these did well on “Grade Point Average” but less well on “Research Intensity”. (As an aside, I think all of these metrics are bullshit, as are the league tables they feed into, but again we are stuck with them for the short term at least. I reserve the right to swear about them when they do pop up though. The twats.)

One idea that is currently doing the rounds in our institution has actually surprised and delighted me. It is that if you want to maximise 4* output – World-leading research – perhaps most people should slow their research down. That is, rather than try and publish multiple papers every year, make sure you spend at least a year on each paper. Do all the additional controls that usually make up your limitations section. Do that analysis that you could have done but didn’t. In short, chew glass, repeatedly, before submitting the paper. This idea delighted me because it is the way science used to be done before research assessment. It also delighted me because it’s the way I work, and I’ve always felt a little guilty for not publishing more papers than I do. Perhaps I shouldn’t feel that anymore.

If the REF has helped senior people see the benefits of the slow and careful method (which is also “the scientific method”) then that’s a good thing. It is the first sign that the tide might turn on the “Publish or Perish” culture that has developed over the last 20 years. It hasn’t turned yet, of course, as the tragic case of Stefan Grimm demonstrates.

There are still plenty of problems with the REF that we should not be afraid of voicing. In addition to feedback, the disproportionate effort that goes into internal auditing and drafting compared to the actual cash we get from it – which is small in comparison to tuition fees – needs to be addressed.  If we have to continue with a peer-review process (no quantitative metric really seems to do the job without encouraging game-playing), then I’d like to see a single paper per submitted staff member. That would encourage excellence, as well as inclusivity in teaching-intensive institutions. It would also dramatically reduce the reviewing panel’s workload. But I’m probably dreaming.

The conclusion to this brain dump is that the REF is still a nightmare of epic proportions, but it might yield some useful practical outcomes. There is a debate to be had about whether such outcomes are worth it, and whether they will be encouraged in reality. Time will tell.