
That time I ended up unemployed…

January 24, 2021

Because reasons, I am currently unemployed.  I’ll get to those reasons in a minute.  They are good reasons in the grand scheme of things, but the path I’ve trodden has not been easy, and probably will not be for a while yet.  I am not the only one walking such a path.  There are several hundred thousand of us already unemployed due to the Coronavirus pandemic, and there are likely to be many thousands more before this is over.  It’s the first time in 20 years of work that I’ve been unemployed, and I’m hoping that it won’t be for too long.  We are also reasonably comfortable financially, and without that comfort the stress level would be considerably higher.  As a result, I consider myself one of the “lucky unlucky” ones.  But what effect this has had on me is what I want to explore.

My predicament is largely self-inflicted; it is only tangentially related to the Coronavirus pandemic.  Just before the pandemic got properly serious, my wife and I were considering the future anyway, and for reasons associated with the finances of the university we both worked at, we decided that moving on might be worth exploring.  Posts at a prestigious university came up, we both took a punt and got through to the final interview (all online by this point).  Two weeks later, my wife got an offer of employment.  It was an offer that would be very difficult for her to refuse, and I was keen for her not to.  I was, of course, anticipating that I too would get the nod, but alas this was not to be.  So, we had a quandary: do we stick or twist?  Sticking would not necessarily be the safe thing to do, and my wife had the best opportunity of her professional life.  The only option, as I saw it, was to take her opportunity, leave my job, and hope for the best.

The plan was that I would take voluntary severance, which provided a reasonable redundancy package, and spend some time writing whilst I looked for a job near our new home.  The advantage of this was that I could help our son settle into school whilst my wife settled into her new job.  So far, so good.  And then it hit me. My career could be over. They didn’t employ me because I wasn’t good enough, and I never will be, I thought.  I liked to think I was one of the best in the world at what I did, but now I had objective evidence that I’m not even the best sport scientist in my own house.  I would look in the mirror and not really know who it was who was staring back at me.  I would look at my wife and son and feel like I’d let both of them down.  I needed help, and for a couple of weeks I was completely unable to function.  I got over this, with help from my wife, who has been amazing, from friends and family, and mental health professionals, but it was a very dark time. Those feelings come back from time to time, often with little or no warning. I’m just a bit better at controlling them.

I can’t help but reflect on the fact that if I was female, taking a career break to help my husband’s career would be perfectly normal.  What kind of husband would I be if I didn’t do the same?  There was never a doubt in my mind that this change was the right one to make.  But it nearly destroyed me, and that meant that I was unable to celebrate my wife’s career success in the way I should have done.  That made me feel even worse.  I’ve spent the better part of eight months reflecting on what I have lost, rather than what we, as a family, and I, as a scientist, stand to gain.  I’m now in a place where I can start to look forward, rather than back.  But the psychological effects of all this are most likely common to others who have gone through similar upheavals.

We normally think of “the unemployed” as a statistic reported in the media. Just a bunch of numbers.  But each individual is going through one hell of an emotional process.  A feeling of utter uselessness is the obvious one.  If I’m not contributing financially, I am worthless, or so the internal monologue goes, which is, of course, untrue.  But try telling your own brain that.  My unemployment is special in the sense that I received a pay-off from my previous employer, and was kept on to teach out the autumn term.  I will miss all of my colleagues immensely.  Being part of a school, an institution, is something I’ve taken for granted, and I really miss it. 

A severance of this kind goes through specific steps.  The first is to apply.  The second is to wait for a week (the “cooling off” period).  In this phase you can change your mind.  I didn’t, and actually felt better when this deadline passed.  Then the letter confirming the legally binding agreement gets sent.  Ever since, I’ve felt like I’m on borrowed time.

What I didn’t anticipate was the degree to which my job defined me.  I felt like I was nothing without it.  But, of course, I am still a scientist and have a brief opportunity to explore things that I probably wouldn’t be able to if I was in a full-time academic role.  But to take those opportunities requires a frame of mind I’m not sure I’ve achieved yet.  Who wouldn’t like a year off to just write stuff?  That is more or less what I have.  But it’s extremely difficult to focus on that when there is no assurance that there will be a job at the end of it all.  This is not a Disney film; there is no guarantee of a happy ending. My head feels like it is full of ants, and at least once a day I replay parts of my unsuccessful interview.  The need to move house, and the stress that creates, has certainly not helped.  But that, too, will resolve itself soon.

So I would not recommend the path I’m treading, but it is my path.  I get to decide which direction it takes (within certain limits – like where we live), and that is exciting.  But one person’s excitement is another’s fear, and these emotions just happen to both occur in me.  The worst year of my life could well lead to the best, so forwards I go, Dr Mark Burnley: husband, father, scientist, unemployed guy.


Exercise intensity domains and phase transitions: the power-duration relationship

August 31, 2020

The exercise intensity spectrum has been a fascination of mine for as long as I can remember, and I’ve been lucky enough to work on it experimentally for the last 16 years or so.  In doing so, I made contributions to the development of the 3 minute all-out cycling test, the examination of the relationship between neuromuscular fatigue and critical torque, and the study of physiological complexity during fatiguing exercise above the critical torque.  Most of these investigations are freely available (references at the bottom).  Most recently, we investigated something that has been bugging me for ages: whether critical power (or speed, torque, force etc.) is best described as a sudden threshold or a more gradual phase transition.  In other words, do the physiological thresholds we all discuss so much represent a single point in the intensity spectrum (e.g., 300 W), or is the landmark characterised by a band of intensity, a grey area if you will (e.g., 290-310 W)?  Before we get to that, I’d like to outline what the exercise intensity spectrum and its physiological significance look like to me.


Physiological landmarks


When athletes perform tests of aerobic function, such as incremental exercise (often called a “VO2max” test), several fitness parameters can be determined.  The point at which lactate begins to increase in the blood is one such parameter, known as the lactate threshold.  This threshold also separates two exercise intensity domains: exercise performed below the lactate threshold is known as ‘moderate-intensity exercise’, and exercise above the lactate threshold is known as ‘heavy-intensity exercise’.  The upper limit of heavy-intensity exercise is generally agreed to be the highest running speed or power output at which a steady state can be attained.  There are two landmarks that can estimate this boundary, namely the maximal lactate steady state, and the so-called critical power.  Exercise performed above this boundary is known as ‘severe-intensity exercise’.  Below we will see how the physiological responses behave in each exercise intensity domain, focusing mostly on how the body matches the energy requirements with aerobic metabolism (if it can).


Moderate exercise:

This domain is perhaps the simplest, because during exercise at these intensities the body rapidly reaches a steady state, in which the energy demand of the task is met by aerobic metabolism.  For this to happen, the body adjusts the rate of oxygen uptake after exercise starts until oxidative metabolism provides essentially all of the energy required.  This is the “steady state”.  If lactate increases in the blood, it does so only transiently.  The respiratory exchange ratio stays below 1.0, meaning that less carbon dioxide is exhaled per unit time than oxygen taken up.  Because the amount of air you breathe in and out every minute is closely coupled, one way or another, to the rate of CO2 output, the demand on the lungs and respiratory muscles is relatively small.  The typical oxygen uptake (VO2) response to moderate exercise looks like this:


Figure 1: the oxygen uptake response to moderate-intensity exercise. Notice the attainment of a steady state in VO2 in less than ~3 minutes.  Also notice the relatively low VO2 response (~50% of VO2max. In this participant, VO2max was 4.25 L.min-1).


The tolerable duration of moderate-intensity exercise has never been precisely established, but we do know that you should be able to sustain it for several hours provided you don’t succumb to injury or hyperthermia.  The fatigue mechanisms are also unclear, although the weight of evidence suggests that, without the involvement of heat stress or high-altitude, central mechanisms of fatigue are dominant, including disruptions to brain neurotransmission resulting in a loss of ‘drive’ to exercise.


Although the moderate-intensity domain is regularly frequented by team sports athletes in recovery from sprints or high-intensity phases, or in repositioning manoeuvres, endurance sports that occupy this domain are almost exclusively ultra-endurance events lasting more than 2 hours.  For example, in a 4-5 hour flat or ‘sprint’ stage of a cycling race, the riders will spend the vast majority of their time in the moderate domain, with the main sprinters only exiting this domain in the final sprint itself.  Prior to that, they will spend the entire stage in the peloton, protected by teammates from exerting any significant effort.  In short, events longer than the marathon are characterized by being performed in the moderate-intensity domain, and the predominant fatigue mechanism is likely to be central in nature.  At these intensities, feeding and hydration are usually easy to ensure with correct planning.  In fact, some of these events are so long that the problem could be access to too much food and fluid, rather than too little!


Heavy exercise:

When athletes exceed the so-called lactate threshold, they enter the ‘heavy-intensity domain’. In this domain, blood lactate concentration is elevated above resting levels but stabilizes, and oxygen uptake reaches a steady state too.  Importantly, it takes 10-20 minutes for oxygen uptake to reach a steady state, and when it reaches this steady state, the oxygen uptake is higher than would be expected if the exercise was moderate.  This slowly increasing O2 uptake is known as the “VO2 slow component”.  This higher oxygen cost of exercise has a number of consequences for fatigue and exercise tolerance, as we will see below.  A typical heavy-intensity oxygen uptake response looks like this:


Figure 2: the oxygen uptake response to heavy-intensity exercise. Notice the greater VO2 response than moderate exercise (~80% VO2max), and the delayed attainment of a steady state, in which VO2 is higher than would be predicted from the responses to moderate exercise.



The fatigue processes that occur during heavy-intensity exercise include both central and peripheral mechanisms.  Peripheral fatigue is not likely to be caused by high-energy phosphate depletion or metabolite accumulation, since a steady state is reached in O2 uptake, blood lactate concentration and, when measured, phosphorylcreatine (PCr), inorganic phosphate, muscle lactate and pH.  Classically, the depletion of muscle glycogen has been identified as the likely cause of fatigue during prolonged heavy exercise.  Multiple lines of evidence support this assertion.  Most recently, it has been demonstrated that there are specific pools of glycogen in a muscle fibre, and the pool depleted most rapidly plays a critical role in excitation-contraction coupling.  This may explain the gradual, rather than catastrophic, loss of muscle force output as exercise progresses and individual fibres become unresponsive to excitatory input.  In the heavy-intensity domain, the higher O2 cost that occurs as a result of the development of the VO2 slow component will result in a faster utilization of the glycogen stores.  It is not surprising, therefore, that marathon runners select a pace that is just above the lactate threshold when racing.  If they ran any faster, they would increase the energy demand of running and risk ‘hitting the wall’.


Central fatigue also occurs in the heavy-intensity domain, but its cause is far from certain.  Some of the mechanisms described above for moderate intensity exercise could also occur during heavy exercise, but because alterations in brain neurochemistry, for example, take several hours to develop, they seem unlikely to explain the central fatigue that develops within minutes of beginning heavy exercise.  It is more likely that the repetitive activity of the muscles makes them, or their motoneurones, harder to drive (that is, they are less responsive to excitatory input from the motor cortex and spinal cord).  Whether these structures are also affected in this way, putting the cause of central fatigue further ‘upstream’ in the CNS, is currently unclear.  An additional component of central fatigue may be related to muscle glycogen depletion: in the latter stages of heavy exercise, blood glucose concentration begins to fall.  This may rob the brain itself of fuel, leading to the confusion, lethargy, and irritability characteristic of somebody ‘hitting the wall’.


Endurance events that take place in the heavy-intensity domain include half-marathon and marathon running, as well as many cycling time trials and 10,000 m swimming.  The first (and so far only) human-powered flight across the English Channel took 2 hours and 49 minutes in 1979.  This supreme feat of both engineering and endurance is described in Wilkie’s biographical sketch, and it is clear that this would only have been possible in an individual sufficiently fit to produce more than 250 W and maintain a steady state. Bryan Allen was that individual.


The upper limit of the heavy intensity domain is, by definition, the highest power output or speed at which a steady state can be maintained.  This boundary point is surprisingly difficult to measure, but two methods have been used extensively.  The first is to perform a series of constant speed trials of up to 30 minutes and identify the point at which a steady state is no longer possible.  The other is to perform several exhaustive exercise bouts in the severe-intensity domain and construct the hyperbolic power-duration relationship.  The point at which this curve flattens out (i.e., when it reaches an asymptote) is known as the ‘critical power’ (analogously, the ‘critical speed’ in running).  Exercise above this point will undoubtedly be non-steady state, and therefore severe.  An analysis of elite marathon runners illustrates the importance of this upper boundary: a selection of the fastest marathon runners of all time shows that all of them performed the marathon at or below the critical speed.  Their ability to run so close to this boundary is likely to be a consequence of them also possessing a very high lactate threshold running speed as a fraction of their VO2max.


The most common and accepted means of countering fatigue during heavy-intensity exercise is to regularly ingest carbohydrates during performance, since it has been shown that late in exercise almost all of the carbohydrates used in the muscle come from blood glucose.  As dehydration can also occur in prolonged heavy exercise, these carbohydrates are often consumed in drinks, or as gels washed down with water.  However, ingesting food and fluids during heavy exercise is not easy, and feeding strategies must be trialled in training to avoid underfeeding or the discomfort of overfeeding.


Severe exercise:

Above the critical power (or critical speed), it is not possible to achieve a metabolic steady state.  Oxygen uptake, blood lactate, muscle PCr, inorganic phosphate and pH fail to stabilise, and exercise duration is limited to 40 minutes or less.  In this intensity domain, the VO2 slow component drives VO2 upward until VO2max is attained.  Task failure occurs soon after VO2max is attained.  The higher the power output above critical power, the more rapidly the slow component increases, and the more rapidly muscle metabolites deplete and accumulate.  All else being equal, exercise terminates when the aforementioned metabolites reach surprisingly similar levels, despite wide variations in exercise duration.  An example of the oxygen uptake response to severe-intensity exercise is shown below.


Figure 3: the oxygen uptake response to severe-intensity exercise.  Notice the absence of a steady state and VO2 rising until VO2max is attained.  The power output in this test was 295 W, 45 W higher than the critical power, and resulted in exhaustion in approximately 12 minutes.


The peripheral mechanisms of fatigue during severe-intensity exercise appear to be directly related to the depletion of high-energy phosphates (PCr) and the accumulation of metabolites associated with them (inorganic phosphate and protons).  Both reduced muscle pH and elevated inorganic phosphate concentrations, alone or in combination, have been shown to reduce muscle force output and/or shortening velocity.  This appears to be the result of direct effects on muscle crossbridge function as well as diminished calcium release and uptake (for a review see Allen et al., 2008).  These findings are often derived from experiments performed in vitro, but importantly it has been shown that similar metabolite changes can be observed during exercise in vivo using biopsy samples or magnetic resonance spectroscopy.


Central fatigue has also been shown to occur during severe-intensity exercise, although again the mechanism underpinning this is unclear.  In addition to reduced motoneurone excitability mentioned above, a further potential source of reduced voluntary activation in this domain is afferent feedback from the fatiguing muscle.  It has been shown that populations of thinly myelinated or unmyelinated afferent fibres are sensitive to substances produced during muscular contraction.  These ‘metaboreceptors’ are thought to produce inhibitory input to various parts of the CNS, reducing voluntary activation and thus the drive to the muscle.  The combination of direct and indirect effects of metabolic changes during severe-intensity exercise therefore appears to be pivotal in the fatigue processes above the critical power.


The tolerable duration of severe-intensity exercise ranges from ~2-40 minutes, meaning that the majority of athletic events labelled “endurance” occur in this intensity domain.  This is why the fatigue processes identified for 5,000 m and 10,000 m running in a previous sub-section were so similar.  The common fatigue mechanisms in this intensity domain also make the tolerable duration of severe exercise highly predictable using as few as two parameters: the critical power (the asymptote in power output) and the degree of curvature in the relationship between power output and time.  This hyperbolic relationship theoretically provides a measure of the highest power output that can be sustained without fatigue (the critical power), and a measure of the amount of work that can be performed above critical power before task failure occurs (labelled W’).  In reality, of course, fatigue is not absent when exercising at critical power.  Rather, it would be more accurate to say that the critical power represents the highest power output that can, in principle, be maintained without experiencing a progressive metabolically-mediated fatigue. The power-duration relationship is shown below.


Figure 4: the power-duration relationship.  In this example, a participant performed four exhaustive constant-load cycling bouts on separate days.  Plotting the time to exhaustion against power produces a hyperbolic relationship of the form: Tlim = W’/(Power – CP), where Tlim is time to exhaustion, W’ is the parameter that defines the curvature of the relationship and CP is the critical power (the asymptote of the power-duration relationship, dashed line).
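For readers who like to see the arithmetic, here is a minimal sketch of how CP and W’ might be estimated from a handful of exhaustive trials. It is an illustration only: the four data points are invented (they are not the values plotted in Figure 4) and have been constructed to lie exactly on a hyperbola, and the function names are my own. The fit uses the linear work-time form of the same model (Work = W’ + CP × Tlim), which is one of several equivalent ways of writing the relationship and not necessarily the fitting method used for the figure.

```python
# Hypothetical trials: (power in W, time to exhaustion in s).
# Invented for illustration; real data would show scatter around the curve.
trials = [(400.0, 120.0), (340.0, 200.0), (300.0, 360.0), (280.0, 600.0)]


def fit_power_duration(trials):
    """Ordinary least-squares fit of Work = W' + CP * Tlim.

    Returns (cp, w_prime), with CP in watts and W' in joules.
    """
    times = [t for _, t in trials]
    work = [p * t for p, t in trials]  # total work done in each trial (J)
    n = len(trials)
    mean_t = sum(times) / n
    mean_w = sum(work) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (w - mean_w) for t, w in zip(times, work))
    cp = sxy / sxx                  # slope of work against time
    w_prime = mean_w - cp * mean_t  # intercept of work against time
    return cp, w_prime


def time_to_exhaustion(power, cp, w_prime):
    """Tlim = W' / (Power - CP); only defined above CP."""
    if power <= cp:
        raise ValueError("the hyperbolic model only applies above CP")
    return w_prime / (power - cp)


cp, w_prime = fit_power_duration(trials)
print(f"CP = {cp:.1f} W, W' = {w_prime / 1000:.1f} kJ")
print(f"Predicted Tlim at 310 W: {time_to_exhaustion(310.0, cp, w_prime):.0f} s")
```

With real trials the points will not sit perfectly on the line, and it is the size of that scatter, relative to the number of trials, that determines how well CP is actually known.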


The power-duration relationship seen above is a very important concept in exercise physiology, because the critical power represents the ‘red line’ in performance.  Exceeding the critical power means that an athlete has only minutes of sustainable exercise left before ‘exhaustion’ occurs.  Consequently, this concept is often used by athletes to develop pacing strategies for races in order to optimise performance, or to exploit perceived weaknesses of opponents.  If athletes get their pacing strategies wrong, and use up the work capacity reflected in the W’ before the finish line, they will slow down dramatically.  This is frequently seen in the final lap of races lasting 1,500 m or longer on the track.  It is also commonly seen in events like the 4000 m team pursuit, in which one of the riders may fail to make the finish line due to depleting their W’ too early.  On the other hand, this can sometimes be a deliberate strategy to use up that rider and allow the other riders to draft them before the final effort.


Thresholds or phase transitions?


As useful as exercise intensity domains and their landmarks are, there has been a tendency in the literature, and particularly in practice, to assume or demand unrealistic levels of precision in their measurement.  It has also been assumed that maximal lactate steady state (MLSS) is THE gold standard measure of the heavy-severe domain boundary, and that CP is inferior.  We have disputed this in a previous viewpoint, but the bigger issue, for me, is the assumption that either MLSS or CP can be recorded to the nearest watt or km/h.  In reality, the above hyperbolic curve is always subject to measurement error, and these errors can be surprisingly large.  For example, if you perform 3 predicting trials, and use a 2-parameter equation to fit the data, you are left with 1 degree of freedom.  With 1 degree of freedom, the critical t value for 95% confidence is ~12.71, so the standard error (say, 5 W) must be multiplied by ~12.71.  The 95% confidence limits in this case would therefore be ~64 W in each direction! Adding predicting trials would reduce both the standard error and the confidence limits, but there would always be some error in the prediction of CP.  One thing we were interested in exploring was how big this error is, functionally speaking.  In other words, how far above CP can you go before you observe consistent severe-intensity behaviour?
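The multiplier in that example comes from the t distribution, and it can be checked with a few lines of code. The sketch below is illustrative only: the function names are my own, and it relies on the fact that the t distribution has simple closed-form quantiles for 1 and 2 degrees of freedom (with 1 degree of freedom it is the Cauchy distribution), so no statistics library is needed. For simplicity it keeps the standard error fixed at 5 W when a fourth trial is added, which is an assumption; in practice the standard error would change too.

```python
import math


def t_critical_95(df):
    """Two-sided 95% critical t value, using closed forms for df = 1 or 2."""
    p = 0.975  # upper-tail quantile for a two-sided 95% interval
    if df == 1:
        # With 1 degree of freedom the t distribution is the Cauchy distribution.
        return math.tan(math.pi * (p - 0.5))
    if df == 2:
        return (2 * p - 1) / math.sqrt(2 * p * (1 - p))
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")


def ci_half_width(standard_error, n_trials, n_params=2):
    """Half-width of the 95% confidence interval for a fitted parameter."""
    df = n_trials - n_params
    return t_critical_95(df) * standard_error


# Three trials, two parameters: 1 degree of freedom.
print(f"3 trials: +/- {ci_half_width(5.0, 3):.1f} W")
# Four trials: 2 degrees of freedom (standard error held at 5 W for illustration).
print(f"4 trials: +/- {ci_half_width(5.0, 4):.1f} W")
```

Even with the standard error held constant, going from three to four trials shrinks the multiplier from about 12.7 to about 4.3, which is why an extra predicting trial is usually time well spent.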


To do this, we used a well-established intermittent isometric contraction model to study the “critical torque”.  The whole study is free to view here, but the key observation was that we failed to observe consistent severe-intensity behaviour in a range of fatigue-related variables when contractions were performed at least two standard errors above the point estimate of the critical torque.  This may not seem too surprising given that we were still inside the 95% confidence limits at this point, but previous literature has suggested that critical power itself should be non-steady state.  We also saw some evidence of severe behaviour on very rare occasions below the critical torque, and plenty of evidence of apparently heavy intensity behaviour above critical torque.  This led us to conclude that the critical torque is best described as a phase transition rather than a sudden threshold.  The grey area is not just a statistical quirk; it is reflected in the underlying physiological responses.


Where does this leave the exercise intensity spectrum, its landmarks and especially the critical power concept? First, if you perform a “coarse grained” analysis of exercise intensity, the domains remain distinct and the transitions between them abrupt.  They are still the foundation of many other concepts in exercise physiology, and the distinct physiological and fatigue responses within each domain are still crucial to appreciate, understand and apply.  But a fine-grained analysis of the physiological landmarks shows that the abrupt transitions are anything but.  Critical power, lactate threshold, or indeed any other threshold marker, will always be associated with a band of uncertainty around it, of the order of about ±10-15 W in the case of cycling.  This uncertainty probably has a physiological basis that you will never avoid: variations in metabolism, blood flow and muscle recruitment will tend to smear these thresholds across a band of power or speed.  This means that if you want to exercise strictly “below critical power” it should be at least two standard errors below critical power, and vice versa above it.
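The “two standard errors” rule of thumb is easy to express in code. This is a minimal sketch, not a validated tool: the function name, the labels and the example numbers (a CP of 300 W with a standard error of 5 W, which reproduces the 290-310 W grey area used as an example earlier) are all assumptions made for illustration.

```python
def classify_relative_to_cp(power, cp, standard_error, n_se=2.0):
    """Label a power output as below CP, above CP, or inside the grey area.

    The grey area is taken as CP +/- n_se standard errors (two by default).
    """
    band = n_se * standard_error
    if power <= cp - band:
        return "below"
    if power >= cp + band:
        return "above"
    return "grey area"


# Hypothetical rider: CP = 300 W, standard error = 5 W, so the band is 290-310 W.
for watts in (285, 300, 315):
    print(watts, classify_relative_to_cp(watts, cp=300.0, standard_error=5.0))
```

The point of the middle category is simply that a target sitting inside the band cannot be confidently called either heavy or severe.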


Where does it leave my view on the power-duration relationship and its physiological basis? I personally think the concept is strengthened by the phase transition idea, since this agrees with, rather than is at odds with, the underlying heterogeneities of the physiological response to exercise.  The grey area around the point estimate of these landmarks makes them feel more real to me. And I like that.



Burnley, M., Doust, J.H. & Vanhatalo, A.  (2006).  A 3 min all-out test to determine peak oxygen uptake and the maximal steady state.  Medicine and Science in Sports and Exercise 38, 1995-2003.


Burnley, M.  (2009).  Estimation of critical torque using intermittent isometric maximal voluntary contractions of the quadriceps in humans.  Journal of Applied Physiology, 106, 975-983.


Burnley, M., Vanhatalo, A. & Jones, A.M.  (2012).  Distinct profiles of neuromuscular fatigue during muscle contractions below and above the critical torque in humans.  Journal of Applied Physiology, 113, 215-223.


Jones, A.M., Burnley, M., Black, M.I., Poole, D.C. & Vanhatalo, A.  (2019).  The maximal metabolic steady state: redefining the ‘gold standard’.  Physiological Reports, 7, e14098.


Pethick, J., Winter, S.L. & Burnley, M. (2015). Fatigue reduces the complexity of knee-extensor torque fluctuations during maximal and submaximal intermittent isometric contractions in humans. Journal of Physiology, 593, 2085-2096.


Pethick, J., Winter, S.L. & Burnley, M. (2016).  Loss of knee extensor torque complexity during fatiguing contractions occurs exclusively above the critical torque.  American Journal of Physiology Regulatory Integrative and Comparative Physiology, 310, R1144-R1153.


Pethick, J., Winter, S.L. & Burnley, M. (2020). Physiological evidence that the critical torque is a phase transition not a threshold. Medicine and Science in Sports and Exercise, in press.


Vanhatalo, A., Doust, J.H. & Burnley, M.  (2007).  Determination of critical power using a 3-min all-out cycling test.  Medicine and Science in Sports and Exercise 39, 548-555.


Vanhatalo, A., Doust, J.H. & Burnley M.  (2008).  Robustness of a 3 min all-out cycling test to manipulations of power profile and cadence in humans.  Experimental Physiology 93, 383-390.


Guidelines and Expectations for PGR Students – Guest post from Prof Andy Jones

December 20, 2018

Professor Andrew M. Jones from Exeter University has written a superb set of basic guidelines for research students to consider and has kindly shared them with me.  In an age of target and metric-driven academia, it gets to the essence of what being a postgrad is all about, and stands as a good indicator of what a great supervisor Andy was to me and is to others still. He’s a professor who has never lost contact with the lab, and is always keen to roll his sleeves up and get involved. Usually these sleeves are part of a Gary Numan or Rammstein shirt, which only makes it better.  So here they are:

1. Integrity

A research degree involves the pursuit of the truth. Sometimes a hypothesis will be accepted, sometimes it will be rejected; either way, you will employ a scientific process and you will be contributing to the development of knowledge. You should conduct research honestly and diligently and not in haste or for self-gain. Never be tempted to cut corners. You need to be able to look at yourself in the mirror and know that your research was ethical and authentic. Research is hard work and it can involve challenges and disappointment as well as joy – but the journey is just as important as the destination. Take no short cuts. Have exemplary standards and aim for excellence.

2. Organisation and Attention to Detail
Keep a lab book and note down everything relevant to your research and your experiments, and make ‘to-do’ lists to keep yourself on track. Learn how things work and develop your lab and technical skills to a high standard. Plan, conduct and report your research meticulously and with the utmost attention to detail so that you and others can replicate your methods.

3. Data Storage and Analysis
Store your data carefully and become adept at using spreadsheets and statistical software and graphical programmes for interrogating your data and presenting your results. Label everything so you always know which version you’re working on.

4. Deadlines
Creating and hitting deadlines is important for progress. If you negotiate a deadline with your supervisors, try to produce your piece of work within the agreed timeframe. Your supervisors will consider your work carefully and provide detailed feedback; you should respond to that feedback with equally careful revisions. Your supervisors will be much more impressed with the care you take than the speed with which you return the next version.

5. Team Work
Research is a team game. At any time, there will be people with more knowledge and experience than you in the research group and other people with less. Everybody gains if experience and knowledge are shared. Be generous with your time: be a study participant; help out in the lab; share data; read draft manuscripts; discuss research articles – others will do the same for you and your research and career will benefit. Never be afraid to admit you don’t know how to do something nor to ask for help from your team.

6. Courtesy and Etiquette
Be helpful and understanding to people both within and outside your research group. In particular, be nice to professional services staff in the department and university: the administrative staff, security staff and, maybe especially, the technical services staff. You’ll enhance your reputation and, when you need them (which will be often) they’ll be much more inclined to help you out! Remember that lab technicians are there to advise and help you but not to do your work for you. Use e-mail wisely: check your wording and attachments and don’t send a stream of e-mails when one good one will do.

7. Trust
Have confidence in your supervisors’ ability to guide and coordinate your research activities to your best advantage. Appreciate that, at times, research can be competitive and treat new results confidentially. Consult your supervisors before reaching out to people beyond your immediate team for advice.

8. Reading and Writing
Read widely. You should become the master of your research topic. Read the newest literature so that you’re up-to-date but also read the classics and read other papers you find interesting too. The best way to improve your own writing is to read good scientific literature. Scientific writing is a skill and it takes time to develop. Expect to write multiple drafts of every article. Keep honing every sentence until it’s ‘perfect’ but realise draft articles can always be enhanced so don’t be shy about sharing them. Always seek to clarify and not to obfuscate.

9. Appreciate Lineage and Legacy
Know the history of your research topic and therefore your place in the story. Learn about the people that came before you that made discoveries on which your work builds. Ask your supervisors about their successes and failures and learn from their experiences. Practise humility. Celebrate your own achievements, take note of your errors, and keep it all in perspective.

10. Conferencing
Research involves communication in writing but also in person. Attend research seminars in your department even if the topic appears to lack direct relevance; you’ll always learn something including about effective presentation techniques. Work on your ability to communicate your research findings and their implications simply and succinctly both to fellow scientists and to members of the general public who may benefit from them. Remember that you are an ambassador for your research team. Represent.

11. And Finally!
Stay excited. Conducting original research is an enormous privilege. It is so exciting to have the opportunity to discover new facts and to communicate them across the world. Appreciate that opportunity, enjoy your studies and have fun.


Andy Jones



“EPO does not work”, or how randomised controlled trials will fail if the nuts and bolts are faulty

July 7, 2017

In late June, possibly the most explosive exercise physiology study in several years appeared to great fanfare in the media. The paper of Heuberger and colleagues, published in The Lancet Haematology, was accompanied by several prominent newspapers reporting that human recombinant erythropoietin (EPO) works no better than placebo in well-trained cyclists.  The evidence that EPO works to improve cycling performance is flimsy, and it may actually be useless, or so the story went.  This was a randomised controlled trial, published in an arm of one of the most prestigious medical journals in the world. But this is one of those stories that sounded too good to be true…


The design

The first thing to say about the study is that its overall design is extremely robust: it followed all the accepted procedures required to ensure a rigorous methodology (both practical and statistical), and the authors went to great lengths to ensure that blinding and randomisation were achieved. The study recruited 48 cyclists who were assigned either to a placebo group, who would receive injections of saline, or to an EPO group who would receive injections of EPO each week for 8 weeks.  Before these treatments started, the cyclists performed an incremental exercise test to task failure, and a 45-minute self-paced submaximal test in which the aim was to maximise power output for that duration.  This test was intended to simulate a competitive time trial under controlled laboratory conditions.  At regular intervals during the treatments, the incremental test was repeated.  At the end of the treatment period, the cyclists performed a final incremental test, the submaximal time trial, and a competitive road race set up for the study in which the cyclists actually climbed Mont Ventoux.


The headline results

The study showed, surprisingly, that the EPO group did not perform any better than the placebo during the submaximal time trial, nor during the ascent of Mont Ventoux. Both groups climbed the mountain in just over 1 h 40 min.  In contrast, the administration of EPO did improve the maximal oxygen uptake (VO2max) as well as the maximal power output in the incremental test.  But, given the null results for the performance tests, which were deemed of greater relevance than the incremental results to competitive cycling, the conclusion drawn was that EPO was ineffective in improving cycling performance.


Are these conclusions valid?

We can approach the answer to this question in a number of ways, and I’m going to focus on two of these: the nature of the tests and the physiology of elite cycling. For the conclusions above to be valid, the tests used must be valid and reliable.  The tests should also reflect the physiology of road race cycling in order to be able to generalise the results to the competitive situation. Here is where the study begins to break down.


The incremental exercise test, itself much maligned by the authors as not being representative of the competitive situation, produced results that are entirely in line with previous studies on EPO administration: the increase in haemoglobin concentration, and thus red cell mass, increased VO2max and thus maximal aerobic exercise performance.  This has been demonstrated repeatedly in the literature and it is precisely why cyclists and other athletes have abused this drug since the late 1980s.  But what of the charge that EPO has no effect on tests more relevant to cycling competition?


The submaximal time trial was performed before and after the treatment phase of the study, and both groups improved their performance on this test at the end of the treatment. There was, however, no difference in performance (as mean power output) between the groups.  The Ventoux race backs this result up, and it was this test that grabbed the headlines, largely due to the history associated with what Lance Armstrong has called “that fucking mountain”.  The authors themselves conclude that EPO “did not improve submaximal exercise test or road race performance” (p. 12).  There is a serious problem with this conclusion: the Mont Ventoux race was only conducted once on a windy day, and so there is no basis on which to suggest that road race performance was improved or not.  You can only speak of improvements in something if you measure that something more than once. If I measured my body mass only at the end of a diet, I doubt you’d agree that the diet “didn’t work” unless I showed you what the scales read before I started it.  But this is exactly analogous to the way in which the Ventoux race was used.


Never mind though, the submaximal test still shows no effect, right? Well, not quite, actually.  You see, there has been a fierce debate in exercise physiology about which tests should be used to quantify exercise performance and monitor the effect of interventions designed to enhance performance.  One school of thought is that time trials are the best because the variance in performance is lower than time to task failure tests (the incremental test used here is, effectively, a time to task failure test, albeit with an increasing power output).  Another viewpoint contends that time to task failure tests are useful because you can measure and interpret the physiological responses as well as having a performance measure, something you cannot do if you allow the cyclist to choose what power they produce.  The truth, as usual, is somewhere in the middle, as Amann and colleagues demonstrated nicely some years ago.  Consequently, when performance-related studies are done on a particular theme (e.g., dietary nitrate), studies are done using both types of test for completeness; both have their merits and drawbacks.  Some studies even do a bit of both (by doing a pre-load constant power phase, followed by a short time trial).


The major drawback of time trial type tests is that they require very careful familiarisation. Performing a self-paced 45-minute effort on a stationary ergometer is a novel task. As with any novel task you get better at it when you attempt it a second time for no reason other than previous experience.  You may even do better the third time.  This learning effect needs to be eliminated before using the test for research purposes.  In the study in question, no such familiarisation took place.  We cannot know, therefore, how much of the change in submaximal performance was due to EPO, and how much was due to learning how to do the test.  Simply stated, the submaximal test is not likely to be a reliable indicator of the effects (or not) of EPO in the way it was performed in this study.


Cycling physiology

The authors of the Lancet study can be forgiven for wanting to test cyclists under conditions that closely mimicked competitive cycling. Unfortunately, arguably neither test achieved this.  Moreover, neither the submaximal time trial nor the Mont Ventoux race challenged the cyclists in a way that would reveal EPO’s effects.  A 45-min time trial is likely to be performed in the upper reaches of the so-called ‘heavy’ domain (above the lactate threshold, but below the critical power/maximal steady state).  At best, the cyclists would be exercising at or slightly above the critical power.  The point here is that the vast majority of cyclists will be exercising at 85% VO2max or less.  The influence of EPO on performance at such intensities is likely to be small.


The benefits of EPO are most obvious at intensities performed close to VO2max (see Wilkerson et al., 2005, for an example of this), yet neither of the performance tests used in the study required such intensities.  The authors contended that maximal performance tests, usually lasting less than 20 minutes, are less relevant because cyclists perform for longer than that “most of the time”.  The problem with this statement is that what cyclists in a grand tour are doing “most of the time” is trying NOT to exert themselves!  It’s why they stay in the peloton.  It may be difficult to believe, but I have it on good authority that the average power sprinters produce on a flat stage is often less than 200 watts, such is the protection offered by the peloton and an organised team.


Mountain stages, which represent the stages in which Grand Tours are won and lost, clearly require greater effort than sprint stages, but even here teams work to protect their team leaders, offering a series of wheels to follow until the last 3-5 km of a major climb. As a result, the General Classification contenders may only exceed their critical power/maximal steady state for the last 15-20 minutes of a mountain stage.  At this point their VO2 will be close to VO2max (which itself will be diminished due to altitude).  It is only at this point that EPO, if used, will reveal its effects.  In short, performance tests lasting 10-20 minutes are exactly the kind of tests you’d want to do if you were trying to find out if EPO worked or not.  To dismiss them in favour of unfamiliar and/or poorly controlled race simulations is folly.


The take home message here is that even though the study of Heuberger and colleagues was a well-structured randomised controlled trial, it was always going to succeed or fail on the strength of the physiological tests within it. The submaximal trial and the Mont Ventoux race have so many problems associated with them that their outcomes are difficult, if not impossible, to interpret.  Consequently, when somebody tells me “There’s a study that shows EPO does not work”, I’ll reply “Delete “not”, pal”.

2:00:25: Eliud Kipchoge’s lesson in endurance physiology (or biomechanics, or psychology, or nutrition, or…)

May 6, 2017

Earlier on this morning, we witnessed a unique and surprisingly exciting time trial over 26.2 miles in Nike’s Breaking2 project, now being rebranded as #VeryNearlyBreaking2 (allegedly). I thought that, since almost everyone else will, I’d give my thoughts on the event. First, two confessions/declarations: first, my old PhD supervisor and close collaborator Prof Andy Jones was a consultant on this project, as was Dr Phil Skiba who I have got to know very well since he started working with Andy’s Exeter group. Second, I only watched the last 25 km of the event because it was a Saturday and I wasn’t going to get up at 4:30 am to watch the first half of a marathon in which almost nothing happens.

I had no inside information about the event, mostly because I didn’t pry, other than the odd “how’s that thing going?” to which the reply would usually be “fine” or “interesting”. Most of what I could glean either was or now is in the public domain, largely thanks to the twitter feeds of Phil and Andy and the work of Alex Hutchinson. But the event itself was, for me, a lesson in the physiology of endurance running and the acute effects of prolonged high-intensity effort. As the post’s title implies, there are plenty of other lessons to learn from this, but this one is mine.

Kipchoge completed the 26.2 miles in a blistering 2:00:25, by ~2.5 minutes the fastest a human has ever run the distance.  At 25 km, he was bang on schedule, but in the last 10 km he lost time, and lost about 15-20 s in the last 5 km alone. Why could he not hold that pace in spite of all of the assistance from the arrowhead of pacers (or the car)? To understand that, I think we need to look at what it takes, physiologically, to run a marathon. We hear a lot about maximal oxygen uptake, lactate threshold (LT) and running economy, as these largely place limits on what is possible. The maximal oxygen uptake is the size of the engine, the LT is the rev limiter (sort of) and the running economy is the miles per gallon (or, in this case, per gram of glycogen) that the engine can offer. What is less well appreciated is that there is not one “rev limiter” in physiology, but two. Running faster than your LT has implications for endurance because you develop a “slow component” of oxygen uptake which increases the oxygen cost of exercise (it reduces economy, in other words). But the slow component can be stabilised provided you run at a speed slower than what is called your “critical speed”. This is the second rev limiter, and I’m going to argue that it is crucial to Kipchoge’s efforts today.

The critical speed, simply defined (!) is the speed asymptote of the speed-duration relationship (analogous to the critical power in the power-duration relationship). Andy Jones and I have written about these concepts in the scientific literature here and here if you want more detail on the definition.  The key point for this post is that a 2 hour marathon effort requires a running speed that is necessarily below the critical speed, because above this point, the slow component of oxygen uptake cannot be stabilised, leading rapidly to the attainment of maximal oxygen uptake and inevitably to task failure.  But the sustainable pace for a marathon run this fast must have been very close to the critical speed (see Jones & Vanhatalo (2017) for more on this). Thus, Kipchoge’s task was to run as close to his critical speed as possible, and stay there. For the most part, he succeeded.
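For readers who prefer numbers to asymptotes, the speed-duration relationship can be sketched with the standard two-parameter hyperbolic model, in which the time to task failure is D′/(speed − critical speed). The critical speed and D′ values below are hypothetical, chosen only to show the shape of the curve; they are not Kipchoge’s (or anyone else’s) measured values.

```python
import math

def time_to_failure(speed, critical_speed, d_prime):
    """Predicted time to task failure (s) from the two-parameter hyperbolic model.

    speed and critical_speed are in m/s; d_prime is the finite distance (m)
    that can be covered above critical speed. At or below critical speed the
    model predicts no finite limit, so infinity is returned (in reality other
    factors, such as fuel stores, eventually intervene).
    """
    if speed <= critical_speed:
        return math.inf
    return d_prime / (speed - critical_speed)

# Hypothetical runner: illustrative values only, not measured data.
CS = 5.9        # critical speed, m/s (assumed)
D_PRIME = 300   # distance available above critical speed, m (assumed)

for speed in (5.8, 6.0, 6.5, 7.0):
    t = time_to_failure(speed, CS, D_PRIME)
    label = "no finite limit in this model" if math.isinf(t) else f"{t:.0f} s"
    print(f"{speed:.1f} m/s -> {label}")
```

The predicted duration collapses within a few tenths of a metre per second above critical speed, which is why a 2 hour effort has to sit below it, however narrowly.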

Kipchoge looked in full control and on pace for much of the Breaking2 effort. But if Kipchoge ran just below his critical speed, why did he slow down in the last 5 km? Well, a normal runner running above LT but below the critical speed is operating in their “heavy domain”, where we strongly suspect glycogen depletion plays a key role in determining exercise capacity, one way or another. The presence of a slow component of oxygen uptake increases the demand on fuel reserves, which probably explains why most studies show marathon efforts being completed just above (but not too far above) LT. We mere mortals cannot get close to critical speed during a marathon lasting 2:30-4:00 hours, but there is good reason to suppose elite athletes can. One reason for that is an apparent “domain compression” in which the LT and critical speed both occur at a very high fraction of maximal oxygen uptake. Another is that elite runners have phenomenal fatigue resistance, in part due to the high percentage of type I (slow twitch) fibres in their muscles.

The above considerations provide a reason for Kipchoge’s basic speed, but not for his slowing down. Obviously, the distance itself places a severe strain on fuel stores, but he was still running exceptionally quickly towards the end, albeit grimacing. He didn’t look like he blew catastrophically. He lost some of the advantage of drafting as the “arrow” collapsed and the car gapped all of the runners, but again this would probably have a minor influence since this only happened in the last few laps. The effects of heat and dehydration were also probably minimal as it wasn’t hot and he was regularly drinking. That leaves few possibilities for his slowing down. Undoubtedly there would have been some muscle damage at this stage, and we know that progressive, slowly-developing fatigue occurs in the heavy domain. However, the slowing in running pace was not progressive, which leads me to conclude that his critical speed itself may have decreased.

The above idea seems at odds with what we have seen experimentally: fatiguing exercise and glycogen depletion do not seem to alter the critical power in cycling, but these experiments are nothing like prolonged exercise performance in elite athletes.  I have heard anecdotal reports of a diminished critical power at the end of cycle stage races or long-duration time trials. If true, it would explain the fall-off in pace without catastrophic failure.  What makes the 2 hour marathon such a challenge is that it is close to the limit of what we think humans can sustain, and the effort itself likely reduces that sustainable pace in the last 10-15 km.

Kipchoge is the first to treat the marathon like a 2 hour time trial and hold it together for most of the distance. If he can find a way of holding it together for the full duration, an athlete of Kipchoge’s talent really could break 2.

I want my country back! Personal thoughts on the EU Referendum

June 3, 2016

I want my country back. It’s a refrain we’ve heard a lot in the last 5-10 years. It’s generally a call to restore some sense of what being British really means, of what Britain should feel like, and the people Britain should be composed of.  What has made this refrain so commonplace is the sense that Britain has lost its sense of identity to an influx of immigrants from the EU and elsewhere.  If only we could get shot of that damned institution we’d be able to get our country back.

My father-in-law is an ex-pat who lives in France and drives around in a French car. Every once in a while he drives over to see us.  On one such occasion he drove it to a local supermarket in Strood, Kent.  At a set of traffic lights he stopped, and a man ventured towards his car, as if he was going to ask directions.  As my father-in-law opened the window, he was greeted with “YOU FRENCH BASTARD!” at which point the lights went green. As a result he was unable to explain that he’d lived most of his adult life in Oxfordshire.  It’s an amusing anecdote, but underneath it is something far nastier: Britain has become increasingly hostile to foreign people in the last few years, and sometimes this boils over into xenophobia and racism.  A colleague of mine from Italy who has lived in the UK for a decade is very clear that this xenophobia is relatively new.  It is only in the last few years that strangers have urged him to “fuck off back to your own country” whilst he is walking down the high street.

Immigrants are the modern day bogeymen, easy to label and criticise, but much more difficult to understand. The media and some politicians have effectively dehumanised anybody who isn’t British and use the term “immigrant” to imply that anybody who comes here does so to suck the life and soul out of Britain.  UKIP can take much of the blame, but the Tory tub-thumping on immigration and Labour’s complete inability to offer an alternative voice have made Britain a very unwelcoming place in the last few years.  I think that the effect of this is much more corrosive than the odd idiot shouting obscenities in the street.

A fact often lost in the EU debate is that EU migrants contribute very significantly to the British economy and British life.  They contribute far more in expenditure and taxation than they claim in benefits. Many of them are highly skilled and would be difficult to replace post-Brexit.  Most sensible politicians know this, which is why they usually talk of immigration numbers, points systems and the like, without actually being positive about EU migrant contributions (that would make them look human, you see, and we can’t have that).  But the corrosive effect of all of this is that these migrants, given the choice, will probably leave the UK post-Brexit, and our country will be much the poorer for it.  This is not because there will be any particular policy that makes them leave, but because the clear attitude of a Britain that votes to leave is that we don’t want to work with YOU.

Although it could be argued that we are just trying to extract ourselves from the political project that is the EU, the truth is that the EU migrants I know have been mulling over whether to stay in the UK or not for a few years now. This has been ever since UKIP gained MPs in the House of Commons, one of whom was briefly the MP in my constituency.  With Brexit now a distinct possibility, one colleague, a Dutch national, is virtually resigned to returning to the Netherlands if it occurs because Dutch citizens are not allowed to adopt dual nationality.  To continue to work in the UK post-Brexit would require a work permit of some kind, and all of the hassle and uncertainty that goes along with it.  Others have cited likely restrictions on work and travel for them and their family members as reasons to leave.

But I’m now going to avoid calling them EU migrants. Instead I’m going to call them what they really are: family and friends.  My niece and nephew are both Italian nationals, and I have French, Belgian, Finnish, German, Italian, Greek, Dutch and Polish colleagues in my institution and elsewhere whose lives will be seriously compromised, if not destroyed entirely, in the event of Britain leaving the EU.  These are brilliant, talented people who deserve far better than to be the collateral damage in what is, and always was, the Conservative Party’s internecine pet project.

So yes, I want my country back. A country that is welcoming, tolerant and outward looking. And to get that country back I’m voting Remain.

Dear Alex: Conversations on the way to school

May 31, 2016

This post is based on a genuine conversation I had on a walk to school with my son (he is 4). I’m posting it here because I think it is as good an advert for having children as anything. Children are amazingly original thinkers even if their view of reality is not always (if ever) quite right. I hope my son keeps his current enthusiasm for finding things out.  Here goes:

Dear Alex:

Our trip to Madame Tussauds clearly made quite an impression on you, judging by our conversation on the walk to school. I feel I need to clear a few things up:

The guy with the feather was William Shakespeare. The guy who looked like a scarecrow in the bit that smelled of poo was a plague doctor. The plague is not “The Black of Death” as you keep saying (you’re close though), and you can’t get it because you sneeze a lot. We have drugs and sanitation now, so doctors don’t need to dress like scarecrows.

The guy with the eye patch and the “pirate hat” was Lord Nelson. He has an eye patch because he hurt his eye in the Battle of the Nile. He is NOT a pirate. In answer to your other question, I don’t know what his friends called him but it was probably something like “Horatio”. He definitely wasn’t a pirate. I know that I wasn’t born when he was alive, but that does not increase the probability that he might have been a pirate just because you think he was. And you don’t win this argument by saying that you know he was a pirate because “Lord Nelson is my brother’s dad”. Even if you had a brother, this would be chronologically and biologically implausible. Whilst he has an eye patch and a hat that makes him look like a pirate HE IS NOT A PIRATE. Nelson has a column too, but that doesn’t make him a journalist.

If you want to continue this line of reasoning, I am quite happy to set up an anonymous Twitter account and sign you up to a few internet forums. You’d excel at trolling.


“This oxygen uptake value doesn’t look right”: on interpreting exercise tests in athletes

December 15, 2015

This post is inspired by a discussion between Antoine Vayer, Jeroen Swart and myself a few days ago on Twitter.  Vayer is a former coach of the Festina cycling team, and a strong advocate of interpreting power output data in the context of doping.  Swart is an exercise physiologist and sports physician who has been involved in the testing of Chris Froome at GSK (or embroiled in the testing, depending on how you look at it – the social media reaction has been quite something.  I can’t put my finger on quite what kind of something it’s been, but it’s something nonetheless).  I noticed that Vayer had put up some data from an incremental cycling test, with the following challenge to “experts”:  “Game 1/10 for experts from Lisbeth ! Who got this VO2 [oxygen uptake] >91 ml/mn/kg ? Is he a cheater or not ? Is it possible ?”.  Now, I consider myself something of an expert here.  In fact, I’d say that the number of people in the world who understand the VO2 response to exercise better than me could comfortably fit in a double-decker bus, and some of them are dead.  So I had a quick look at the data.

The maximal oxygen uptake (VO2max) value was recorded during an incremental test in which it appears that the athlete exercised for about 4 minutes each stage and rested between stages, with gas exchange data recorded every 30 seconds.  The 91 mL/kg/min VO2max was recorded towards the end of the penultimate stage displayed in Vayer’s tweet.  But there was something odd about it. The VO2, in absolute terms was 6.29 L/min, but this value was achieved at a power output of 425 W.  Any exercise physiologist faced with a high VO2 will naturally enquire about the power at which it was achieved, or if faced with a high power output, will enquire about the VO2 achieved.  The first thing you learn (or should learn) when analysing test data like this is to ask the question “does it look right?”.  If the data deviate wildly from what is considered normal, it’s probably wrong for some reason.  This works for athletic data as well as any other because there are robust relationships between VO2 and power output.  Vayer’s data make no physiological sense.

The VO2-power output relationship follows this rule-of-thumb: for every watt of power produced, you consume about 10 mL of oxygen per minute in the steady state (give or take 1 mL/min/W).  This makes for a very handy error detector in the lab.  This works by taking any absolute VO2 value (in this case 6.29 L/min), subtracting ~0.80-1.00 L/min to account for the O2 cost of pedalling, and dividing the answer by power output.  In this case, (6290 − 1000)/425 = 5290/425 ≈ 12.4 mL/min/W (this figure is known as the “gain” for VO2).  In other words, there appears to be a very large error in VO2 of about 1.0 L/min.  The actual VO2 that should be associated with a power output of 425 W is ~5.25 L/min (assuming a baseline pedalling VO2 of 1.0 L/min, which at a cadence of 92 rpm seems reasonable).  It may even be lower than that in an efficient athlete. Jeroen Swart estimated 4.95 ± 0.15 L/min VO2 for the same power.
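If you want to run this check yourself, the arithmetic above can be sketched in a few lines of Python. The function names are my own for illustration, and the 1.0 L/min baseline and 10 mL/min/W gain are the rule-of-thumb assumptions stated above, not measured values.

```python
def vo2_gain(vo2_l_min, power_w, baseline_l_min=1.0):
    """VO2 gain in mL/min/W: VO2 above the pedalling baseline, per watt."""
    return (vo2_l_min - baseline_l_min) * 1000.0 / power_w

def expected_vo2(power_w, gain_ml_min_w=10.0, baseline_l_min=1.0):
    """Steady-state VO2 (L/min) predicted by the rule of thumb."""
    return baseline_l_min + gain_ml_min_w * power_w / 1000.0

# Values from the test data discussed above.
measured_vo2 = 6.29   # L/min
power = 425           # W

gain = vo2_gain(measured_vo2, power)
predicted = expected_vo2(power)

print(f"gain: {gain:.1f} mL/min/W")
print(f"expected VO2: {predicted:.2f} L/min")
print(f"excess VO2: {measured_vo2 - predicted:.2f} L/min")
```

With these inputs the gain comes out at about 12.4 mL/min/W and the measured VO2 sits roughly 1 L/min above the rule-of-thumb prediction. Changing the assumed baseline to 0.8 L/min only makes the gain look higher.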

By way of contrast, Chris Froome’s recent data produces a gain of ~9.4 mL/min/W (5.91 L/min, less 1 L/min, so 4910/525 = 9.4).  This is a normal VO2 gain, and may even be an underestimate given that this was measured during a ramp test, in contrast to the 4 minute stages used by Vayer (quite correctly if steady state VO2 was an important variable to measure).  In Froome’s case, the 30 W/min ramp is non-steady state and VO2 will lag power output.  Additionally, if Froome had a plateau in VO2 (or was approaching one), this would have reduced the gain still further.  Thus, it is reasonable to suppose that his steady-state VO2 gain would be higher than 9.4 mL/min/W. In all likelihood, it would be very close to 10 mL/min/W.  Why, then, is the VO2 recorded in the test in question so high?  There are four possibilities: 1) an extravagant metabolic response by the cyclist; 2) an error in standardisation of ambient conditions; 3) an ergometer calibration error, or 4) an error in the flow or oxygen sensor and/or calibration.

Extravagant metabolism

It is possible that the relationship between VO2 and power output can be altered by exercising at high intensities for prolonged periods.  This increases the VO2 gain, and values >12 mL/min/W have been observed in the literature.  So the “slow component of oxygen uptake” could drive the VO2 above the predicted steady state value and increase the gain.  I have done a bit of work on the slow component, and I think it may be a factor in this test.  It is, however, unlikely to explain the VO2 being 1.0 L/min above expected.  This is because, as its name suggests, the slow component takes time to express itself.  It takes 90-120 seconds to emerge from the “normal” or “fast component” of the response, and if you fit a curve to it, it has a time constant of at least 200 seconds.  It therefore takes many minutes to develop, and in this test many minutes we do not have (nor does the VO2 seem to be systematically rising in the 425 W stage).  It is also unlikely that previous test stages generated much slow component behaviour.  This is again partly due to their length, and partly due to the modest blood lactate concentrations measured (2.1 mM at 375 W).  The slow component only develops above the lactate threshold, and in the 375 W stage the cyclist is only just above it.  Any slow component would be small even if the 375 W stage was continued for 6-8 minutes.  Finally, the denominator of the VO2 gain equation is against us: the largest slow components recorded in the literature are for exhaustive exercise bouts lasting 10-15 minutes, and they can exceed 1 L/min.  To achieve this in the time required in this stage would need VO2 to rise at an unbelievably swift rate, and we just don’t see this happening.  So, whilst the slow component of VO2 is real and significant, it does not appear to be an explanation for the high VO2 values reported.
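To put rough numbers on the timing argument, here is a minimal sketch that treats the slow component as a delayed mono-exponential, which is a common way of fitting it but an assumption on my part for this test. The delay and time constant are the figures quoted above, and the 240 second stage length is the approximate 4 minutes estimated from the data.

```python
import math

def slow_component_fraction(t_s, delay_s, tau_s):
    """Fraction of the eventual slow component amplitude expressed at time t_s.

    Assumes a mono-exponential rise that starts only after delay_s seconds.
    """
    if t_s <= delay_s:
        return 0.0
    return 1.0 - math.exp(-(t_s - delay_s) / tau_s)

STAGE_S = 240   # approximate stage length, seconds
TAU_S = 200     # time constant, seconds (a lower bound according to the text)

for delay in (90, 120):
    frac = slow_component_fraction(STAGE_S, delay, TAU_S)
    needed = 1.0 / frac   # eventual amplitude needed to show 1.0 L/min by stage end
    print(f"delay {delay} s: {frac:.0%} expressed, "
          f"needs ~{needed:.1f} L/min eventual amplitude")
```

On these assumptions only about half of any slow component would be visible by the end of a 4 minute stage, so an extra 1.0 L/min would imply an eventual slow component of around 2 L/min, and a longer time constant makes that figure larger still.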

Ambient conditions and gas exchange calculations

Pulmonary gas exchange variables must be corrected from ambient temperatures and pressures to standardised conditions before interpretation.  These days automated systems do these calculations for you, but sometimes input is needed as part of routine calibration.  For ventilation, values are expressed BTPS – Body Temperature and Pressure, Saturated – because the true volume exhaled at the time matters.  For VO2 (and VCO2), the correction provides Standard Temperature and Pressure for Dry gas (STPD).  The only way to introduce error here is to fail to change the ambient temperature settings prior to analysis.  However, a 10°C error in temperature (unheard of in a well-ventilated or an air-conditioned laboratory) would change VO2 by no more than ~0.3 L/min.  A similar error would occur if you got barometric pressure wrong by 30 mmHg.  Thus, it is unlikely that a 1.0 L/min error would be introduced here.
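The size of these errors can be checked with a first-order sketch. It assumes that the gas volume is measured saturated at ambient temperature (ATPS) and that VO2 scales in proportion to the STPD correction factor; a real metabolic cart may handle this differently, so treat it as an illustration rather than a description of the Oxycon. Water vapour pressure is estimated with the Tetens approximation, and the 20°C and 760 mmHg "true" conditions are assumed for the example.

```python
import math

MMHG_PER_KPA = 7.50062

def water_vapour_pressure_mmhg(temp_c):
    """Saturated water vapour pressure (mmHg) via the Tetens approximation."""
    kpa = 0.61078 * math.exp(17.27 * temp_c / (temp_c + 237.3))
    return kpa * MMHG_PER_KPA

def stpd_factor(temp_c, pressure_mmhg):
    """Factor converting a saturated ambient (ATPS) volume to STPD."""
    dry_pressure = pressure_mmhg - water_vapour_pressure_mmhg(temp_c)
    return (273.15 / (273.15 + temp_c)) * dry_pressure / 760.0

def vo2_error(vo2_l_min, true_cond, entered_cond):
    """Approximate size of the VO2 error (L/min) from entering wrong conditions.

    Each condition is a (temperature in deg C, pressure in mmHg) pair.
    """
    ratio = stpd_factor(*entered_cond) / stpd_factor(*true_cond)
    return vo2_l_min * abs(ratio - 1.0)

VO2 = 6.29  # L/min, the value under discussion

temp_error = vo2_error(VO2, (20, 760), (30, 760))       # 10 deg C wrong
pressure_error = vo2_error(VO2, (20, 760), (20, 730))   # 30 mmHg wrong

print(f"10 deg C temperature error: ~{temp_error:.2f} L/min")
print(f"30 mmHg pressure error: ~{pressure_error:.2f} L/min")
```

Under these assumptions both mistakes land in the region of 0.25 to 0.35 L/min, well short of the 1.0 L/min that needs explaining.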

Ergometer calibration

A cycle ergometer that is not calibrated can produce very strange VO2 responses.  I vividly remember taking delivery of an ergometer that had not been stored correctly and its flywheel was off by a few millimetres.  Being electrically-braked, a few millimetres is huge, and I was exhausted at a power output 150 W lower than normal with an (apparently) enormous VO2 gain.  But my VO2max was in the normal range, so it didn’t seem to be the gas analyser, and it just felt wrong.  In the case in hand, it is the VO2 that seems too high, not the power.  And an elite cyclist would also very quickly realise if the ergometer was out. As an example, I once started a treadmill test on an athlete who got quite animated about how hard the 10 km/h warm-up felt.  It turned out that somebody had changed the settings to miles per hour!  The ergometer is not likely to be the problem in this case.

The gas analyser

The origin of the error can be limited to the gas analysis system.  I am not sure whether the Oxycon system being used in this test (as Vayer told me it was) was being used in a mixing chamber mode or breath-by-breath, but it seems that the ventilatory variables look normal: minute ventilation is not astoundingly high for an athlete producing 90 mL/kg/min (if anything it is too low).  Indeed, the VE/VO2 ratio at maximal exercise is usually >35 in my experience.  Here it is ~32.  Not that low, but low all the same.  In short, the flow sensor or ventilatory volume measurement is not a strong contender for the extra litre/min of VO2.

This leaves us with the O2 sensor itself.  The origin of the error is impossible to pin down without knowing the precise specifications of the analyser (there are a number of Oxycon models, some of which use fuel cell type analysers, others paramagnetic sensors), but it is possible that an electrochemical fuel cell analyser had reached the end of its life at the time of the test and started reading high. Alternatively, a calibration error resulting in incorrect zero and/or span calibration could have caused a systematic error in VO2.  It is important to state that this error is not peculiar to the 425 W stage: the 375 W stage preceding it produces gain values of around 12 mL/min/W, so this error is evident throughout the test [in a previous edit, I said 14.5 mL/min/W – I’d forgotten to subtract baseline VO2 in the calculation.  Mistakes are easy to make with gas exchange data!].  A calibration error on one of the calibration points would amplify the erroneous gain at lower power outputs, wherein the expired O2 fraction (FEO2) would be lower than during maximal exercise (that is, the error will get larger or smaller as FEO2 falls away from 20.95%).  That we are not seeing this means that the whole calibration curve is systematically in error.  The cause of this (analyser ‘drift’ or fuel cell end-of-life performance) is impossible to call without data from the calibration procedure itself.
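To show how easy the baseline mistake is to make, here is a minimal sketch of the gain calculation.  The numbers are not the test data: the VO2 of 5.44 L/min at 375 W is back-calculated from the 14.5 mL/min/W figure quoted above, and the 1.0 L/min baseline is an assumed value for unloaded pedalling.  The function name is mine.

```python
def vo2_gain_ml_min_w(vo2_l_min, power_w, baseline_l_min=0.0):
    """VO2 gain (mL/min/W): rise in VO2 above baseline per watt of power output."""
    return (vo2_l_min - baseline_l_min) * 1000.0 / power_w


# Hypothetical values, back-calculated from the gains quoted in the text.
vo2_at_375_w = 5.44      # L/min, assumed
baseline_vo2 = 1.0       # L/min, assumed cost of unloaded pedalling

# Forgetting the baseline inflates the gain.
gain_without_baseline = vo2_gain_ml_min_w(vo2_at_375_w, 375.0)
gain_with_baseline = vo2_gain_ml_min_w(vo2_at_375_w, 375.0, baseline_vo2)

print(f"no baseline subtracted: {gain_without_baseline:.1f} mL/min/W")
print(f"baseline subtracted:    {gain_with_baseline:.1f} mL/min/W")
```

With these assumed inputs the uncorrected gain is about 14.5 mL/min/W and the corrected gain is about 12 mL/min/W, which is the size of the slip I made.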

In conclusion, I’m overthinking this.

Or, to put it another way, the issues above illustrate why physiological testing is unlikely to ever be a major pillar of anti-doping efforts: there are too many sources of error, as well as too much variation in testing protocols, the equipment and ergometers used between labs.  Anybody who has attended a conference on sports physiology will appreciate that there are almost as many measures of “threshold” as there are people working in the field.  Getting scientists and practitioners to agree measurement standards seems a very long way off.  Even if such standards could be agreed, there are no clear physiological “red lines” above which doping can be inferred.  This is because athletes, like all humans, occupy a normal distribution of physiological function.  More correctly, the parameters of endurance performance (be they physiological, biomechanical, psychological) are all normally distributed, and the sum of these distributions makes the athlete who they are.  Doping can and does shift some of those curves, but from where to where? For specific individuals we simply don’t know most of the time, and until we are sure a change in physiological test results is not due to errors we inadvertently introduce, we never will know.


So with that, Merry Christmas…

Reigning Blood: Thoughts on the Scandal of Blood Doping in Athletics

August 22, 2015

I was asked if I’d write a thing about doping for a website, so I did but the editor never got back to me.  Here is what I wrote:

This year the World Athletics Championships is being held in Beijing. But there is no doubt that this event is clouded by a doping scandal that threatens to seriously damage, if not destroy, the sport’s credibility.  The scandal was the result of two news reports.  The first concerned the allegation that a high proportion of Russian athletes were involved in systematic doping.  The second was the Sunday Times/ARD programme and publication in the first weekend of August which leaked extracts from an historic blood profile database held by the International Association of Athletics Federations (IAAF). When these profiles were analysed by experts in biological passport analysis, the results suggested that as many as 1 in 7 athletes from 2001 to 2012 had returned “suspicious” values, tainting many major championships, including the 2012 Olympic Games.  Clean athletes have lost medals, failed to make finals, and lost funding because they were out-competed by athletes who were doping.

The reaction to the leak has itself been revealing: the leaders of the IAAF have come out fighting against the findings and denying any wrongdoing; the experts concerned have countered these arguments. Finally, social media platforms have been ablaze with comment and speculation about which athletics star will be implicated next.  This scandal can, I think, be better understood if the scientific basis of the allegations is understood. Furthermore, the reaction of those at the top of sport and anti-doping, as well as the public reaction to the scandal, tells us a lot about how a future scandal such as this might be avoided.

What does “a suspicious profile” actually mean?

The athlete biological passport (ABP) is a method used to detect doping, not by finding a drug in blood or urine samples, but by finding evidence of doping in biomarkers such as haemoglobin concentration.  The ABP works by an individual athlete regularly giving blood samples, from which a profile is generated with individual limits (“probability thresholds”, currently 99%) set to identify abnormally high or low values. A “suspicious” profile is one which exceeds these limits at some point or points, or which contains deviations or fluctuations that are identifiably abnormal (these are known as “Atypical Passport Findings”). An expert reviews the profile and uses other data (if available) to confirm that it is not likely to be due to pathology or specific circumstances that might affect blood sampling (e.g., illness).  If those other data are not available, the result remains “suspicious” and follow-up data collection and additional testing can be performed.  The “suspicious” profiles noted by the Sunday Times/ARD experts were those exceeding a 99% probability threshold.  This is consistent with the current WADA guidelines.

The reason the “suspicious” profiles are not direct evidence of doping is that the ABP requires further review for this to occur. A suspicious result warrants further review by two independent experts if the first reviewer has ruled out normal physiology and pathology. Only if the three reviewers unanimously agree that the use of a prohibited substance or method is highly likely and illness or other factors are unlikely to account for the results can an “adverse passport finding” result be declared, which then initiates disciplinary procedures against the athlete.

It is worth pausing here briefly to make the point that ABP findings are considered “atypical” and “suspicious” when there is not enough evidence to be certain that things other than doping could explain the finding. Results only become “adverse” (that is, the athlete can be sanctioned) with further evidence and review.  The data on which the scandal is based are atypical, NOT adverse findings.

The rigour with which the ABP is administered and managed makes it sound perfect, but it isn’t.  As with standard doping control tests, the ABP is designed specifically to avoid false positives (that is, you want to go out of your way to avoid sanctioning a clean athlete).  This means that the false negative rate is high. In other words, cheats can and do avoid sanction (more on this later).

The issue the experts in the Sunday Times/ARD story took with the IAAF was that suspicious findings did not seem to be followed up.  The IAAF did have procedures in place to do so.  The allegation that they did not is, therefore, very serious. I am in no position to say whether this is true or not, but on other points of the story I have some sympathy for the IAAF.  Much of the focus in the original story was on the World Championships in Helsinki in 2005.  But the sanctioning of athletes using the ABP did not come into force in athletics until 2009.  Judging the actions of the IAAF by the standards of today seems a little unfair.  That said, Dr Michael Ashenden, one of the experts consulted and an architect of the ABP itself, makes a compelling argument that the IAAF could and should have done more at the time in any case.

The good, the bad, and the ugly

The reaction to the scandal has been an interesting exercise in sports politics and the power of social media.  WADA stated that it was shocked and within a week had announced an investigation into the allegations.  The full scope of this investigation is not clear, but it is likely to focus, in part, on how the data were disclosed rather than how the IAAF responded to the blood profiles pre-2009. Lord Coe, now president of the IAAF, came out fighting, suggesting that war had been declared on athletics, and dismissing the analysis of the “so-called experts”.  This was likely to be part of his successful attempt to get elected to the presidency, but it was also an exercise in how not to manage a crisis.  The experts he maligned really were experts in this particular field.  They have diagnosed that his sport is sick, and he needs to listen to their advice.  More broadly, in the post-Armstrong era, having an international federation “circle its wagons” in this way was bound to lead to accusations of cover-up, whether true or not.

The IAAF’s bullish reaction to the scandal led to commentators, on social media in particular, savaging the IAAF.  This is understandable, given that these commentators are already extremely hostile to any official narratives in the wake of the Lance Armstrong case and the role the UCI played in it.  They were also primed by a Tour de France in which suspicion of the GC contenders grew exponentially in spite of attempts to prove innocence. It was to be expected that they would turn their guns on athletics, and several high-profile athletes have been the subject of intense speculation as to whether their ABP profiles are suspicious.  Some of that commentary was thoughtful, detailed, and nuanced, but even the most cursory search on Twitter shows many high-profile athletes being very seriously libelled with doping allegations.

Given recent cases, such as McAlpine vs. Bercow, naming names without evidence on social media is extremely ill-advised.  The acid test here is: has an athlete been mentioned by name? Has doping guilt been implied, even indirectly? Does the athlete in question have an adverse ABP finding? If the answer to the first two questions is “yes”, and the answer to the last question is “no”, then the athlete has been libelled, in the UK at least.

How should the IAAF deal with this in the future?

It is important to remember that the goal of anti-doping is to protect clean athletes.  This is difficult because proving a negative in systems shrouded in secrecy is almost impossible.  Some form of transparency that federations and athletes can buy into would be a start.  Some athletes have chosen to make their passport data public, whereas others have not given their consent. This is already leading to assumptions that lack of disclosure means that the athletes in question have something to hide.  This might be true, but it may also mean that the athletes possess passport profiles that contain atypical findings that are explicable, but they are concerned that key contextual details may be lost or simply dismissed.  It is also possible that the explanation for an atypical finding contains information of such a personal nature that the athlete does not want any of the information in the public domain.  All that considered, the IAAF and WADA urgently need to build transparency into their systems.  One way would be to periodically, or on request, perform a full three-expert review to give an athlete a clean bill of health (or otherwise), and publish the resulting ABP Document Package with the athlete’s consent.

Finally, there is the future of anti-doping per se.  If we have learnt anything from the Armstrong case, it is that analytical anti-doping measures will only go so far.  What did for Armstrong was tenacious investigative journalism and equally tenacious law enforcement input, followed by a forensic investigation by USADA.  In this regard, catching cheats isn’t just about blood.  If anti-doping policy was a pheasant shoot, doping controls and the ABP would be akin to beating the ground and sending in the dogs: to be successful, you still need the people with guns in the end.  WADA and the sports federations covered by the code need to use all necessary means to catch doping cheats, because the current systems are not protecting clean athletes in the way they should.

Data transparency in cycling: necessary, utopian, and a complete can of worms

July 19, 2015

This year’s Tour de France is developing into a bit of a split race, being both exciting by stage and predictable by General Classification (GC).  This was most clearly demonstrated by the blistering performance of yesterday’s stage winner Steve Cummings of MTN-Qhubeka (the African team’s first stage win, on Mandela Day, no less), followed by Chris Froome hoovering up all attacks against him.  It was an eventful ride for Team Sky, with fists, saliva and urine apparently being thrown at them.  They are currently the sport’s bad guys, for no reason other than dominance.  The last team to dominate like Sky did was one of the liveries led by Lance Armstrong, and Sky’s tactics and public relations stance continue to draw uncomfortable parallels with the Armstrong era.  This suspicion has led to calls for Sky (and others) to be more transparent about their power data in particular, since the view goes that teams with nothing to hide should hide nothing.

Something something Armstrong, something something Froome. Right, let’s SCIENCE… [Forget personalities, there’s a link in two paragraphs time in which the awesome David Wilkie uses very simple power modelling to make a bicycle fly.]

Power output and the physiological response to exercise

Mountain stages in the Tour are critical to success.  One bad day in the mountains can cost you the race, and a good day can get you a Yellow Jersey.  In contrast, sprint stages rarely produce gaps in the GC, and time trial stages are predictable and (to within a minute or so) run to form.  That’s problematic if you’re on the wrong side of the minute, but not fatal.  Gaps of over 5 minutes are sometimes seen in the mountains.  In a mountain climb, where air resistance plays a more limited role, the rider who can sustain the highest power, and thus (bike and body mass accounted for) the highest speed for the duration of all the climbs, is likely to win the Tour.  Time trial specialists cannot win the Tour with time trials alone.  They must train to climb (contrast Boardman’s Tour performances with those of Wiggins – riders with very similar initial backgrounds but very different training approaches to the road).

Because sustaining a high power output on a climb is crucial, there has been a great deal written about the limits of what is possible.  I am not going to add to this debate, as in my view without direct measurement of power as well as an understanding of the rider’s physiological capacities (aerobic and anaerobic) there are too many assumptions to be sure that a conclusion about whether something is possible or not can be drawn.  There are performances that might look suspicious, but a 4 min mile performance in running would have looked suspicious in 1935.  By 1995 it was considered slow.  We do, however, know a few things about what determines sustained power, thanks to scientists like AV Hill, David Wilkie, and a number of others.

The physiological response to exercise depends on the power output you produce.  For “moderate” exercise, muscle oxygen uptake rises rapidly and reaches a steady state.  Blood lactate either does not rise or rises only transiently.  At these work rates, exercise can be sustained for many hours.  For “heavy exercise”, when you exceed the lactate threshold, oxygen uptake takes longer to stabilise and does so at a higher value than you would predict from steady state responses to moderate exercise (in other words, you are less efficient).  This is the result of a “slow component” of the oxygen uptake response that develops after about 2 minutes of exercise and stabilises after 15-20 minutes.  In the heavy domain, exercise can be sustained for between 45-60 min and about 3-4 hours.  For “severe” or “high-intensity” exercise, the oxygen uptake slow component does not stabilise (and nor does any other metabolic response) until maximal oxygen uptake (VO2max) is attained.  Exhaustion inevitably follows soon after this occurs. The “severe-intensity domain” commences when you exceed the critical power (CP).  The CP, in turn, represents the asymptote of the power-duration relationship, first noted by AV Hill in 1925.  We’ve written a few papers about these concepts, which you can find here (free) and here (not free).  The power-duration relationship can be defined by as few as two parameters, namely the CP and a parameter to define the shape of the curve, denoted W’.  The CP is thought to reflect the power of the aerobic systems of energy delivery and the W’ is thought to reflect the “anaerobic capacity”, although we know this is a little simplistic. It is the power-duration relationship that is important for working out what is and is not possible when cycling up a hill.

Defining possible and impossible

If you know the values of CP and W’, and you know the power demand of a task, you can make a clear prediction about what the time limit of the task is. The equation, for those who want it, is:

Time limit = W’/(power output – CP)

The problem is that the above parameter values will vary between athletes and will vary day to day.  The parameters and the underlying physiology that determines them can also be influenced by various acute interventions (like glycogen depletion, for example), which adds further uncertainty to any “back of the envelope” calculations that you might wish to make. To know if any performance is abnormal, you need to know what the power-duration parameter values actually are. Consider that an elite cyclist with GC ambitions might have a CP of about 380-440 W, and a W’ of 20-30 kJ, both of which will depend, to some extent, on body mass.  This means that to complete an effort lasting 40 minutes, with a W’ of 25 kJ and a CP of 420 W, the “normal” power output sustainable for this duration would be 430 W, or 6.1 W/kg for a 70 kg rider.  Notice here that the contribution of the curvature constant to long duration efforts is quite small (about an extra 10 W over 40 min) and thus the most crucial determinant of mountain performance is the maximal sustainable power output, CP.
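For those who like to see the arithmetic, here is a minimal sketch of the two-parameter model using the example values from the paragraph above (CP of 420 W, W’ of 25 kJ, a 40 minute effort, a 70 kg rider).  The function and variable names are mine and are purely illustrative.

```python
def time_limit_s(power_w, cp_w, w_prime_j):
    """Time limit (s) from the two-parameter model: t = W' / (P - CP)."""
    if power_w <= cp_w:
        raise ValueError("the model only gives a time limit for power above CP")
    return w_prime_j / (power_w - cp_w)


def sustainable_power_w(duration_s, cp_w, w_prime_j):
    """The same equation rearranged for power: P = CP + W' / t."""
    return cp_w + w_prime_j / duration_s


# Example values from the text.
cp_w = 420.0
w_prime_j = 25_000.0       # 25 kJ expressed in joules
duration_s = 40 * 60       # 40 minutes
body_mass_kg = 70.0

power_w = sustainable_power_w(duration_s, cp_w, w_prime_j)
w_prime_contribution_w = power_w - cp_w

print(f"sustainable power: {power_w:.0f} W ({power_w / body_mass_kg:.1f} W/kg)")
print(f"of which W' contributes: {w_prime_contribution_w:.0f} W")
```

Feeding the resulting power back into `time_limit_s` returns the 40 minutes we started with, which is a useful sanity check if you swap in your own CP and W’ estimates.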

One reason why I don’t think fixating on a particular W/kg value as “possible” or “suspicious” really works is that it all depends on the value of CP.  This value is unknown and variable!  Obviously, for a 70 kg rider to sustain 6.1 W/kg without drawing on the W’, CP would need to be at least 430 W.  I don’t think that is unreasonable, given previously documented hour record performances and the power outputs produced during them (Bassett et al., 1999).  To sustain 430 W would require an oxygen uptake of approximately 5.3 L/min (O2 cost of ~10 mL/min/W, plus ~1 L/min for the O2 cost of spinning the legs at 90-100 rpm), which, if the rider is capable of utilising ~90% of VO2max, would predict a VO2max of 5.9 L/min or 84 mL/kg/min.  This is high, but certainly not unheard of.  If the rider could only sustain 85% of VO2max, the same effort would require a VO2max of 89 mL/kg/min.  That is still not impossible.  And this is all assuming a normal mechanical efficiency.  Efficiency would decrease due to the development of a slow component of oxygen uptake, but this would add no more than about 200 mL/min to the tally (that is, oxygen uptake would remain submaximal even with this factored in).
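The oxygen uptake estimate above is simple enough to sketch in code.  The O2 cost of 10 mL/min/W, the 1 L/min cost of spinning the legs, and the 90% and 85% utilisation figures come from the paragraph above.  The function names are hypothetical, and the slow component is deliberately left out.

```python
def required_vo2_l_min(power_w, o2_cost_ml_min_w=10.0, unloaded_cost_l_min=1.0):
    """Steady-state VO2 (L/min): cost of spinning the legs plus the cost per watt."""
    return unloaded_cost_l_min + power_w * o2_cost_ml_min_w / 1000.0


def implied_vo2max_ml_kg_min(vo2_l_min, fraction_sustained, body_mass_kg):
    """VO2max (mL/kg/min) needed if vo2_l_min is a given fraction of VO2max."""
    return vo2_l_min / fraction_sustained * 1000.0 / body_mass_kg


body_mass_kg = 70.0
vo2_l_min = required_vo2_l_min(430.0)

vo2max_at_90 = implied_vo2max_ml_kg_min(vo2_l_min, 0.90, body_mass_kg)
vo2max_at_85 = implied_vo2max_ml_kg_min(vo2_l_min, 0.85, body_mass_kg)

print(f"VO2 at 430 W: {vo2_l_min:.1f} L/min")
print(f"VO2max needed at 90% utilisation: {vo2max_at_90:.0f} mL/kg/min")
print(f"VO2max needed at 85% utilisation: {vo2max_at_85:.0f} mL/kg/min")
```

Changing the assumed O2 cost or the fraction of VO2max that can be sustained moves the answer by several mL/kg/min, which is rather the point: the conclusion is only as good as the assumptions.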

Knowing the possible

The above calculations are hypotheticals based on reasonable estimates.  The numbers accompanying Froome’s (or Nibali’s or Contador’s or…) performances that appear on the internet are just as hypothetical.  In short, we have no direct numbers for either physiological capacity or performance for GC riders at the time of the Tour.  Values estimated from those recorded in other parts of the season are likely to underestimate the capacity of a rider who has peaked and ridden conservatively in much of the first week of racing.  In addition, direct measures of rolling resistance, wind speed, temperature, altitude, and so on, are also absent.  To know what’s possible would require direct power-duration measurements from Froome immediately before the Tour, as well as calibrated power data during each and every stage.  It is likely that Sky possess both data sets.  They most likely have a variety of physiological measures that could corroborate the power-duration data (i.e., the VO2max, efficiency and LT data would likely fit in the same general picture).  But they refuse to place these data in the public domain.  Should they?  The scientist in me says yes.  The sports fan in me says maybe.  The pragmatist in me says that there is next to no chance of these contemporary data ever seeing the light of day.

For one thing, Froome’s consent would be needed to release these data, and even if that consent was given, where would the data be stored and how would access be gained?  If Froome releases his, every rider in the Peloton should be obliged to release theirs, lest there be any accusations of unfair treatment.  The teams are highly unlikely to want to do this for competitive reasons.  It’s much the same reason why Formula 1 teams do not release telemetry data in real time – a good rival engineer would identify engine modes, brake balance, tire wear etc and use that information to the team’s advantage.  The sport would become even more about who has the best support crew rather than the best performer.

A less problematic point is that not all teams use the same power-measuring devices.  Moreover, where on the bike the power is measured also matters.  Although often measured at the crank or the rear wheel hub, it’s the power transferred to the road that counts (producing forward propulsion), but the power actually produced on the pedals that costs (in terms of physiological demand).  Frictional losses and rolling resistance (though presumably minimised) will also differ, adding errors to any calculations of who produced watt, where and when…

The Future

There has been some chatter on Twitter and elsewhere of power files from races being used as part of the Athlete’s Biological Passport in cycling.  I can see some merit in this, as within each athlete, performances can be compared to their power-duration relationship, their physiology and their blood parameters already used.  The observation of abnormal power outputs alongside sudden changes in, for example, the Off-score, might trigger closer scrutiny of that athlete in the coming months.

Finally, I can see the potential for power data from grand tours being released following an agreed embargo period.  This would serve an educational and scientific purpose of providing a rich seam of data to be used by anybody who wanted it.  Those data could also be used as part of a retrospective anti-doping case.  But they’d only ever be part of the story.  If there was reasonable circumstantial evidence of doping in the absence of a positive test (like the Armstrong case, for the most part), then power files could add weight to that case.  But it would only be a small amount, given the number of variables involved in ultimately producing power output.

I’ve almost certainly not done this issue justice, but the above thoughts lead me to conclude that the question of data transparency in cycling and what its potential uses are does not have any easy answers.


The contentious and irrelevant bit:

Armstrong’s shadow

The similarity between Armstrong-era cycling and today ends with what is written above.  Quite a few people have asked David Walsh, the man who was instrumental in taking down Armstrong, why he is not asking Sky and Froome tough questions.  I personally think that is wrong-headed.  Armstrong’s transformation post-cancer was mind-blowing, whereas Froome’s ascent has been more incremental.  Add to that the accumulation of damning evidence throughout Armstrong’s career, covered up by Armstrong with the help of the UCI, and in that case Walsh could ask questions about tangible things in Armstrong’s closet.  Froome’s closet is bare by comparison, save for a TUE and some stunning performances on the road.  So there was good reason to pursue Armstrong, but much less to pin on Froome and Sky.  This is why calls for data transparency are timely.

[EDIT: I’ve got quite a few comments about the LA/CF comparison I made above both here and elsewhere.  I’ve a good mind to delete it because I think it detracts from the point I’m trying to make (that power output is interpretable in context but it’s always likely to be very difficult).  I’m not going to delete it, however, because it would make the post more boring, and writing really bloody boring stuff is something I’m already pretty good at.  My point to those arguing over this is that (like the rest of the post, actually) context is everything.  We know most of the details surrounding Armstrong’s “inverted U-shaped” career progression, in which he went from an aggressive stage racer/breakaway specialist to cancer patient to GC domination.  In this, he went from not really being notable as a climber to dropping Pantani. That’s astonishing.  Froome burst onto the scene in late 2011, but had been with Sky for about 18 months before that, and showed some potential between bouts of illness and injury.  The impact of these is not certain because, again, of the context. Sky, due to its links with British Cycling, was and still is awash with riders who are good against the clock, so he wasn’t in the team to be the time triallist.  He was clearly there to be part of the Sky train, in a team hell-bent on GC success with Bradley Wiggins.  In that context, Froome’s rise to prominence was not particularly fast, but was perhaps unexpected given the circumstances at the time. Importantly, he didn’t completely change his style of riding to break through.  I am not so naïve as to think that there aren’t other explanations, but, again, that’s not what the post is about, and I’ve tried to avoid drawing any of those conclusions. Back to the day job…]