“EPO does not work”, or how randomised controlled trials will fail if the nuts and bolts are faulty

July 7, 2017

In late June, possibly the most explosive exercise physiology study in several years appeared to great fanfare in the media. The paper by Heuberger and colleagues, published in The Lancet Haematology, was accompanied by reports in several prominent newspapers that recombinant human erythropoietin (EPO) works no better than placebo in well-trained cyclists. The evidence that EPO improves cycling performance is flimsy, and the drug may actually be useless, or so the story went. This was a randomised controlled trial, published in an arm of one of the most prestigious medical journals in the world. But this is one of those stories that sounded too good to be true…


The design

The first thing to say about the study is that its overall design is extremely robust: it followed all the accepted procedures required to ensure a rigorous methodology (both practical and statistical), and the authors went to great lengths to ensure that blinding and randomisation were achieved. The study recruited 48 cyclists who were assigned either to a placebo group, who would receive injections of saline, or to an EPO group who would receive injections of EPO each week for 8 weeks.  Before these treatments started, the cyclists performed an incremental exercise test to task failure, and a 45-minute self-paced submaximal test in which the aim was to maximise power output for that duration.  This test was intended to simulate a competitive time trial under controlled laboratory conditions.  At regular intervals during the treatments, the incremental test was repeated.  At the end of the treatment period, the cyclists performed a final incremental test, the submaximal time trial, and a competitive road race set up for the study in which the cyclists actually climbed Mont Ventoux.


The headline results

The study showed, surprisingly, that the EPO group did not perform any better than the placebo group during the submaximal time trial, nor during the ascent of Mont Ventoux. Both groups climbed the mountain in just over 1 h 40 min. In contrast, the administration of EPO did improve the maximal oxygen uptake (VO2max) as well as the maximal power output in the incremental test. But, given the null results for the performance tests, which were deemed of greater relevance to competitive cycling than the incremental results, the conclusion drawn was that EPO was ineffective in improving cycling performance.


Are these conclusions valid?

We can approach the answer to this question in a number of ways, and I’m going to focus on two of these: the nature of the tests and the physiology of elite cycling. For the conclusions above to be valid, the tests used must be valid and reliable.  The tests should also reflect the physiology of road race cycling in order to be able to generalise the results to the competitive situation. Here is where the study begins to break down.


The incremental exercise test, itself much maligned by the authors as not being representative of the competitive situation, produced results that are entirely in line with previous studies on EPO administration: the increase in haemoglobin concentration, and thus red cell mass, increased VO2max and thus maximal aerobic exercise performance.  This has been demonstrated repeatedly in the literature and it is precisely why cyclists and other athletes have abused this drug since the late 1980s.  But what of the charge that EPO has no effect on tests more relevant to cycling competition?


The submaximal time trial was performed before and after the treatment phase of the study, and both groups improved their performance on this test at the end of the treatment. There was, however, no difference in performance (as mean power output) between the groups. The Ventoux race backs this result up, and it was this test that grabbed the headlines, largely due to the history associated with what Lance Armstrong has called “that fucking mountain”. The authors themselves conclude that EPO “did not improve submaximal exercise test or road race performance” (p. 12). There is a serious problem with this conclusion: the Mont Ventoux race was conducted only once, on a windy day, and so there is no basis on which to say whether road race performance improved or not. You can only speak of improvements in something if you measure that something more than once. If I measured my body mass only at the end of a diet, I doubt you’d agree that the diet “didn’t work” unless I showed you what the scales read before I started it. But this is exactly analogous to the way in which the Ventoux race was used.


Never mind though, the submaximal test still shows no effect, right? Well, not quite, actually. You see, there has been a fierce debate in exercise physiology about which tests should be used to quantify exercise performance and monitor the effect of interventions designed to enhance performance. One school of thought is that time trials are the best because the variance in performance is lower than in time-to-task-failure tests (the incremental test used here is, effectively, a time-to-task-failure test, albeit with an increasing power output). Another viewpoint contends that time-to-task-failure tests are useful because you can measure and interpret the physiological responses as well as having a performance measure, something you cannot do if you allow the cyclist to choose what power they produce. The truth, as usual, is somewhere in the middle, as Amann and colleagues demonstrated nicely some years ago. Consequently, when performance-related research is done on a particular theme (e.g., dietary nitrate), both types of test are used for completeness; both have their merits and drawbacks. Some studies even do a bit of both (by doing a pre-load constant power phase, followed by a short time trial).


The major drawback of time trial type tests is that they require very careful familiarisation. Performing a self-paced 45-minute effort on a stationary ergometer is a novel task. As with any novel task you get better at it when you attempt it a second time for no reason other than previous experience.  You may even do better the third time.  This learning effect needs to be eliminated before using the test for research purposes.  In the study in question, no such familiarisation took place.  We cannot know, therefore, how much of the change in submaximal performance was due to EPO, and how much was due to learning how to do the test.  Simply stated, the submaximal test is not likely to be a reliable indicator of the effects (or not) of EPO in the way it was performed in this study.
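A rough way to see why the missing familiarisation matters, even with a placebo group: the average learning gain cancels out when the two groups are compared, but the rider-to-rider spread in learning gain adds noise to every change score. The sketch below is purely illustrative. The standard deviations (in watts) are invented numbers, not values from the study, the 24 riders per group assumes an even split of the 48 cyclists, and the function names are hypothetical.

```python
import math


def se_group_difference(sd_noise, sd_learning, n_per_group):
    """Standard error of the between-group difference in mean change score.

    Assumes each rider's change (post minus pre) contains independent
    measurement noise (sd_noise) plus an individual learning gain whose
    spread across riders is sd_learning. Both values are hypothetical.
    """
    var_change = sd_noise ** 2 + sd_learning ** 2
    return math.sqrt(2 * var_change / n_per_group)


def smallest_detectable_effect(se, z_alpha=1.96, z_power=0.84):
    """Approximate effect detectable at two-sided alpha 0.05 with 80% power."""
    return (z_alpha + z_power) * se


N_PER_GROUP = 24  # assumption: the 48 cyclists were split evenly

# Invented illustrative values in watts, not taken from the paper.
familiarised = se_group_difference(sd_noise=8.0, sd_learning=0.0, n_per_group=N_PER_GROUP)
unfamiliarised = se_group_difference(sd_noise=8.0, sd_learning=10.0, n_per_group=N_PER_GROUP)

print(f"Familiarised riders:   detectable effect ~ {smallest_detectable_effect(familiarised):.1f} W")
print(f"Unfamiliarised riders: detectable effect ~ {smallest_detectable_effect(unfamiliarised):.1f} W")
```

Under these assumptions, the more the learning gain varies between riders, the larger a drug effect has to be before the test can pick it up, which is one way an unfamiliarised time trial can hide a real difference.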


Cycling physiology

The authors of the Lancet study can be forgiven for wanting to test cyclists under conditions that closely mimicked competitive cycling. Unfortunately, arguably neither test achieved this.  Moreover, neither the submaximal time trial nor the Mont Ventoux race challenged the cyclists in a way that would reveal EPO’s effects.  A 45-min time trial is likely to be performed in the upper reaches of the so-called ‘heavy’ domain (above the lactate threshold, but below the critical power/maximal steady state).  At best, the cyclists would be exercising at or slightly above the critical power.  The point here is that the vast majority of cyclists will be exercising at 85% VO2max or less.  The influence of EPO on performance at such intensities is likely to be small.
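To make the domain argument concrete, here is a minimal sketch of how a power output maps onto intensity domains using the two boundaries named above. The ‘moderate’ and ‘severe’ labels follow the usual convention in the exercise physiology literature; the rider's threshold values and the example power outputs are made up for illustration, and the function name is hypothetical.

```python
def intensity_domain(power_w, lactate_threshold_w, critical_power_w):
    """Classify a power output by the conventional domain boundaries.

    moderate: at or below the lactate threshold
    heavy:    above the lactate threshold, up to the critical power
    severe:   above the critical power
    """
    if lactate_threshold_w >= critical_power_w:
        raise ValueError("lactate threshold must be below critical power")
    if power_w <= lactate_threshold_w:
        return "moderate"
    if power_w <= critical_power_w:
        return "heavy"
    return "severe"


# Hypothetical rider: thresholds are invented, not taken from the study.
LT_W, CP_W = 250, 310

for power in (190, 300, 340):
    print(f"{power} W -> {intensity_domain(power, LT_W, CP_W)}")
```

On this picture, a 45-minute effort sits in the middle case for most riders, and only the last case pushes oxygen uptake towards VO2max.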


The benefits of EPO are most obvious at intensities close to VO2max (see Wilkerson et al., 2005, for an example of this), yet neither of the performance tests used in the study required such intensities. The authors contended that maximal performance tests, usually lasting less than 20 minutes, are less relevant because cyclists perform for longer than that “most of the time”. The problem with this statement is that what cyclists in a grand tour are doing “most of the time” is trying NOT to exert themselves! It’s why they stay in the peloton. It may be difficult to believe, but I have it on good authority that the average power sprinters produce on a flat stage is often less than 200 watts, such is the protection offered by the peloton and an organised team.


Mountain stages, which represent the stages in which Grand Tours are won and lost, clearly require greater effort than sprint stages, but even here teams work to protect their team leaders, offering a series of wheels to follow until the last 3-5 km of a major climb. As a result, the General Classification contenders may only exceed their critical power/maximal steady state for the last 15-20 minutes of a mountain stage.  At this point their VO2 will be close to VO2max (which itself will be diminished due to altitude).  It is only at this point that EPO, if used, will reveal its effects.  In short, performance tests lasting 10-20 minutes are exactly the kind of tests you’d want to do if you were trying to find out if EPO worked or not.  To dismiss them in favour of unfamiliar and/or poorly controlled race simulations is folly.


The take-home message here is that even though the study of Heuberger and colleagues was a well-structured randomised controlled trial, it was always going to succeed or fail on the strength of the physiological tests within it. The submaximal trial and the Mont Ventoux race have so many problems associated with them that their outcomes are difficult, if not impossible, to interpret. Consequently, when somebody tells me “There’s a study that shows EPO does not work”, I’ll reply “Delete ‘not’, pal”.

4 Comments
  1. robin parisotto
    July 9, 2017 4:37 am


    Concur with your comments. While I am no expert with regards to the physiology of performance (but rather haematological aspects of sports performance), the glaring issue for me with this paper was the climb up Mont Ventoux. Was it performed under ‘race’ conditions, where there may be numerous breakaways involving maximal efforts that might be repeated? Or were the subjects in this cohort just riding up in one continual effort? As you point out, it is during these extra efforts that EPO (any blood manipulation for that matter) would come into its own. While on the one hand I commend the researchers on their intentions (that all drugs on the banned list should be investigated for real performance benefits), on the other hand their conclusions are very questionable with regards to EPO specifically.


    • July 9, 2017 8:53 am

      Hi Rob

      Yes, the Ventoux trial was strange. The paper doesn’t clearly explain how this was conducted. It seems the two groups went up in one peloton as if racing. That makes it quite possible that the placebo group simply rode on the wheels of the EPO group, resulting in similar performances. The weather conditions on the day (85 km/h winds on the summit) make the use of drafting a serious confounding factor in group comparisons. Like you say, it’s a commendable effort overall, but if I’d advised them on the design I’d have said “delete that” on mention of the Ventoux race!

  2. Jamie Langley
    July 10, 2017 11:39 pm

    In relation to EPO having maximal effects on exercise performed at close to VO2max, do you think this has any bearing on why in athletics we have not seen any progression in the 5000m & 10000m WRs over the last 10-15 years, while the marathon WR has continued to tumble?



