21 July 2011

Back under 5% for Stage 18!

Here is the Stage 18 result:

  • Stage 18:  6h 07' 56" (actual), 5h 51' 23" (prediction), -16' 33" slow (-4.50% error)
I am happy to be under 5% again!  Did the poor weather slow riders down a little?  If so, I feel even better about my prediction because my model did not include adverse weather for Stage 18.  Then again, it rained on Stage 16, and that one remains a mystery to me.
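For anyone who wants to check the arithmetic, here is a minimal sketch of how the difference and percent error above can be reproduced. The sign convention (prediction minus actual, divided by actual) is an assumption that happens to match the posted numbers, and the function names are purely illustrative.

```python
def to_seconds(hours, minutes, seconds):
    """Convert an h/m/s stage time to total seconds."""
    return 3600 * hours + 60 * minutes + seconds


def prediction_error(actual_s, predicted_s):
    """Return (difference in seconds, percent error).

    Assumed convention: difference = prediction - actual,
    percent error = 100 * difference / actual.
    """
    diff = predicted_s - actual_s
    return diff, 100.0 * diff / actual_s


actual = to_seconds(6, 7, 56)       # Stage 18 winning time
predicted = to_seconds(5, 51, 23)   # Stage 18 prediction

diff_s, pct = prediction_error(actual, predicted)
sign = "-" if diff_s < 0 else "+"
minutes, seconds = divmod(abs(diff_s), 60)
print(f"{sign}{minutes}' {seconds:02d}\" ({pct:.2f}% error)")
```

A negative value under this convention means the predicted time was shorter than the actual winning time.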

The Schleck brothers got it done today.  Here is Andy Schleck's average speed:
  • Stage 18:  9.08 m/s (20.3 mph)
Thomas Voeckler was able to hold on to the yellow jersey.  Tomorrow's stage, which is the final mountain stage, might just decide this year's winner.  Here is my prediction:
  • Stage 19:  2h 57' 54" (prediction)
Stage 19 should be incredible!  Riders climb to an elevation of 2.556 km (1.588 miles) as they reach Col du Galibier in the French Alps at the 48.5-km (30.1-mile) mark.  The next 46 km (28.6 miles) will then make for a great downhill.  The final 15 km (9.32 miles) will have riders climbing a 7.9% grade to get to the famous ski village at L'Alpe d'Huez.  In 2004, Lance Armstrong dominated that climb in that year's Stage 16, which was an individual time trial.  Armstrong beat the second-place finisher, Jan Ullrich, by just over a minute.  That stage sticks out in my mind because I cover it in detail in Chapter 4 of my book.

If weather does not slow riders down, I hope to see the winner's time sneak under three hours.
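As a rough sanity check on that hope, the sketch below works out the average speed implied by the Stage 19 prediction. It assumes the three segments described above (48.5 km, 46 km, and 15 km) cover the whole stage; the variable names are illustrative only.

```python
MPH_PER_MS = 3600 / 1609.344  # exact conversion from m/s to mph

# Assumption: these three segments from the stage description span the full stage.
segments_km = [48.5, 46.0, 15.0]
stage_km = sum(segments_km)

predicted_s = 2 * 3600 + 57 * 60 + 54   # 2h 57' 54" prediction
three_hours_s = 3 * 3600

predicted_speed = stage_km * 1000 / predicted_s    # m/s
threshold_speed = stage_km * 1000 / three_hours_s  # m/s needed to finish in 3 h

print(f"Stage length: {stage_km:.1f} km")
print(f"Predicted average speed: {predicted_speed:.2f} m/s "
      f"({predicted_speed * MPH_PER_MS:.1f} mph)")
print(f"Speed needed for a three-hour finish: {threshold_speed:.2f} m/s")
```

Any winner averaging more than the second speed over the full distance would finish in under three hours.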


  1. I'm still wondering about the power values. You said "seeks to find the winning stage time, which means using the top power outputs". The tour tracker only shows the top few riders so any riders that appear on it are likely to be close to the top power outputs.

    I found the paper "Exercise intensity and load during mass-start stage races in professional road cycling" by Padilla, S. et al.: http://edulife.com.br/dados%5CArtigos%5CEducacao%20Fisica%5CFisiologia%20do%20Exercicio%5CExercise%20intensity%20and%20load%20during%20mass-start.pdf . They estimated power output in flat (FLAT), semi-mountainous (SEMO), and high-mountain (HIMO) terrain using heart rate data rather than direct power measurement.

    Estimated power output was:
    FLAT (192 +/- 45 W)
    SEMO (234 +/- 43 W)
    HIMO (246 +/- 44 W)

    Your power data are:
    Pb =
    200 W (θ < −0.055)
    325 W (−0.055 < θ < 0.062)
    425 W (0.062 < θ < 0.09)
    500 W (0.09 < θ).

    So the flat is similar, but your steep-terrain values are much higher than those of Padilla et al. The tour tracker observed values were:
    253 W - Andy Zeits on the last hill
    291 W sustained for a while, with a momentary maximum of 474 W - Irizar on the first climb.

    These values are also close to Padilla et al., i.e., 246 +/- 44 W

  2. Ken: Thanks for your continued interest in the power issue. The overwhelming majority of the angles in the profile data are such that the power is in the low to intermediate range. There are just a few inclines that require the largest powers. By the way, the power data you give for me come from my last paper on the Tour de France, which described the model I used for the 2004 race. For this year's race, the angular cutoffs are slightly different, and several stages have 400 W instead of 425 W. I have used a slightly more complicated power model this time around.

    I also point out that estimating power output for a given rider on a given stage of the Tour de France is challenging. For example, look here:


    A colleague alerted me to that blog just today. You will see power estimates consistent with what I have been using.

    I will do some more searching and see if I can find others who do power estimates.

  3. Some more data from "Physiological and Performance Characteristics of Male Professional Road Cyclists" by Mujika I. and Padilla S.: http://www.ingentaconnect.com/content/adis/smd/2001/00000031/00000007/art00003

    Mass-start stages are raced at low mean intensities (210W for the flat stages, 270W for the high mountain stages), but are characterised by their intermittent nature, with cyclists spending on average 30 to 100 minutes at, and above 370 to 390W, and 5 to 20 minutes at, and above 400 to 420W.

    Again, these numbers are consistent with the tour tracker observations, except for 474 W, which may have been momentary or a glitch.

    The intermittent nature mentioned might be important for your model because you match power to grade. Even if the average is right, the particular grades on which the intermittent high-power sections occur will affect your prediction.

  4. That is a great link you provided. It also links to the SRM live site which must be where the tour tracker is getting its data from. I wondered where the tracker was getting its data and I'm pleased to have discovered it.

    There is all the data you could ever want at SRM live, so if that data is consistent with your model, it must be right. There is so much detail here that you could refine your model further if desired.

  5. Ken: Thanks for the additional information. The power inputs I use are certainly consistent with what you describe. My code needs a trigger to change the power, and that trigger is usually the change in grade. There are only two or three angular segments that require the absolute top power. My model, like any model of the race, is obviously more simplistic than reality. What models like mine often employ are research-based estimates of necessary parameters, which means there is a bit of averaging that takes place. There are too many aspects of reality to adequately model, like when the riders eat, relieve themselves, move in and out of the peloton, etc. Racing psychology and strategy are also very difficult to incorporate into a model, especially one that tries to predict the future.

    When my former student, Ben Hannas, and I first set out to model the Tour de France in 2003, our goal was to create a model that incorporated much of reality, but remained simple enough that it could be used easily for any tour stage. There is always a compromise between simplicity and complexity when creating a model. My goal now is to use a model to predict the stage-winning time. Once a stage is completed, I can look at what others have measured for power and check the validity of my model. From what I've seen so far, the power inputs I use are reasonable.
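For readers following the power discussion in the comments above, here is a minimal sketch of a grade-triggered power lookup using the 2004-model values quoted in the first comment. It is not the actual model code. It assumes θ is the slope angle in radians and that a value exactly on a cutoff falls into the higher band, neither of which is stated in the thread; the function and constant names are hypothetical.

```python
import math

# (upper angle cutoff, power in watts) for the 2004-race values quoted above.
# Assumption: theta is a slope angle in radians; boundary values go to the
# higher-power band because the quoted ranges use strict inequalities.
POWER_BANDS = [
    (-0.055, 200.0),  # steep descents
    (0.062, 325.0),   # gentle descents, flats, and moderate climbs
    (0.09, 425.0),    # steep climbs
]
TOP_POWER = 500.0     # steepest climbs (theta above 0.09)


def rider_power(theta):
    """Return the quoted power (W) for slope angle theta."""
    for upper, watts in POWER_BANDS:
        if theta < upper:
            return watts
    return TOP_POWER


def grade_to_angle(grade_percent):
    """Convert a road grade in percent to a slope angle in radians."""
    return math.atan(grade_percent / 100.0)


for grade in (-8.0, 0.0, 7.9, 10.0):
    theta = grade_to_angle(grade)
    print(f"grade {grade:5.1f}%  theta {theta:+.4f}  power {rider_power(theta):.0f} W")
```

Under these assumptions, a 7.9% grade like the final climb to L'Alpe d'Huez corresponds to an angle of about 0.079 and falls in the 425 W band of the 2004 values. The cutoffs used for this year's race are described above as slightly different, so the numbers here are only illustrative.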