Some of the most engaging and illuminating conversations come from seeing how big data is providing insights into, and changing, the activities we are passionate about.  We experienced this firsthand while recording our podcast and video on how data analytics is transforming the user experience at Squaw Valley, and our podcast on how Splunkers are uniquely applying analytics to dating.  (Be sure to check those out if you have not already.)  In addition to skiing and snowboarding, my sons and I are passionate about cycling and triathlon.  Innovations in cycling training driven by data and technology are in large part responsible for my kids’ enthusiasm.  In the cycling industry, advanced end-user applications, analytics, simulation, and high-performance computing (HPC) are helping transform these sports at all levels.

In 2013 the Canadian bike company Cervelo set out to design the ultimate triathlon bicycle using data, creating the Cervelo P5x.  The English newspaper the Guardian said it best: “If you were to weaponize a bicycle, it would look like this…[i]t looks like Batman’s ultimate ride.”  By dispensing with convention and following the data, Cervelo engineering changed typical bicycle design priorities and created arguably the fastest triathlon superbike.

In developing the P5x, Cervelo made the conscious decision to set aside subjective bias and design precedent and examine the data.  The goal was to create the best and fastest triathlon bike on the market from scratch, free from prevailing convention and Union Cycliste Internationale (UCI) rules.  Cervelo spent more than a year compiling a terabyte of data to analyze.

This data included:

  • 14,500 photos
  • Interviews of athletes, coaches, fitters, and bike dealers
  • Test rides in a range of conditions

Among the insights drawn from this data:

  • 2-3 water bottles are carried by most riders on their bikes
  • 1,600 calories in additional nutrition is carried by the average age-group competitor in Ironman
  • No significant commonality in rider bike configurations for nutrition and hydration; all are unique and highly personalized.
    • The most common configuration accounted for only 3.8% of riders.
  • Capacity for round bottles is preferred, despite the aerodynamic cost, so racers can swap in what is generally available
  • Seat tube not necessary for strength and design

These insights led to new design considerations focused on configuration flexibility, fit efficiency, and storage, which reportedly came to direct the design as much as structure, aerodynamics, and weight.  Making these objectives a top priority is a change in basic assumptions.  For most manufacturers, and especially riders, the ability to shave weight is paramount.  Watts per kilogram by any means necessary.  Even as a triathlete of over 200 pounds, would I spend five hundred dollars on an oversized pulley wheel system?  I have considered it.  We are buying speed, and we tend to presuppose that speed is synonymous with shedding weight.  The fact of the matter is that the P5x is slightly heavier (13%) than its benchmark brother, the Cervelo P5.
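To put that 13% in perspective, here is a rough back-of-the-envelope sketch in Python.  Only the 13% delta and the 200-pound rider come from this article; the 9 kg baseline bike mass and the 250-watt power figure are hypothetical values chosen for illustration, not Cervelo numbers.

```python
# Rough sketch: how much does a 13% heavier bike change system watts per kilogram?
# ASSUMPTIONS (not from Cervelo): the baseline bike mass and rider power are
# made up for illustration. Only the 13% delta and 200 lb rider are from the text.

LB_TO_KG = 0.45359237


def watts_per_kg(power_w, rider_kg, bike_kg):
    """Power-to-weight ratio for rider plus bike."""
    return power_w / (rider_kg + bike_kg)


rider_kg = 200 * LB_TO_KG                    # 200 lb rider, about 90.7 kg
baseline_bike_kg = 9.0                       # hypothetical baseline bike mass
heavier_bike_kg = baseline_bike_kg * 1.13    # the same bike, 13% heavier
power_w = 250.0                              # hypothetical sustained power

base = watts_per_kg(power_w, rider_kg, baseline_bike_kg)
heavy = watts_per_kg(power_w, rider_kg, heavier_bike_kg)
loss_pct = (base - heavy) / base * 100

print(f"baseline bike:    {base:.3f} W/kg")
print(f"13% heavier bike: {heavy:.3f} W/kg")
print(f"change in system W/kg: -{loss_pct:.2f}%")
```

Under these assumptions the heavier bike costs a little over 1% in system watts per kilogram, because the rider dominates the total mass.  Different assumed bike masses shift the number, but not the overall picture.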

So, what does the data tell us as to why user interface aspects of design like configuration and fit are as important as the actual mechanics of the bicycle?  Well, 80-85% of my energy output is used to overcome resistance, mostly drag.  Here is the thing: 75-80% of that resistance is me, not the bike.  Yup, I am a big drag.  More important than the aerodynamics of the bike itself is the ability of the bike to place me in a comfortable and efficient position that minimizes drag for an extended period.  Speaking from personal experience, if you must sit tall in the saddle because your butt hurts, your junk goes numb, or your muscles ache, a 13% weight delta on the bike counts for nothing.  What good is an extra 5 watts from a component if you are losing 15 to deficiencies in body mechanics?  To address this, the design for the P5x incorporates the ability to make broad adjustments easily.  Interestingly enough, this has also led to simpler sizing and, I would presume, reduced variance and complexity in manufacturing, forecasting, and inventory management.  Why do I postulate these tangential business improvements in addition to the direct performance improvements?  Because where other triathlon bikes like the P5 have six frame sizes ranging from 48 to 61 cm, the P5x has only four (Small through Extra Large).
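The drag arithmetic above can be worked through in a few lines of Python.  The percentages and the 5-watt and 15-watt figures are the ones quoted in this article; the 250-watt rider output is a hypothetical number used only to turn percentages into watts.

```python
# Sketch of the drag arithmetic using the percentages quoted in the article.
# ASSUMPTION: the 250 W rider power is hypothetical, chosen only to convert
# percentages into watts.

def rider_drag_share(resistance_share, rider_share_of_resistance):
    """Fraction of total output spent overcoming resistance caused by the rider."""
    return resistance_share * rider_share_of_resistance


low = rider_drag_share(0.80, 0.75)     # low end of both quoted ranges
high = rider_drag_share(0.85, 0.80)    # high end of both quoted ranges

power_w = 250.0  # hypothetical sustained power
print(f"rider share of total output: {low:.0%} to {high:.0%}")
print(f"at {power_w:.0f} W that is {low * power_w:.0f} to {high * power_w:.0f} W")

# The component trade-off from the text: gain 5 W, lose 15 W to poor position.
component_gain_w = 5
position_loss_w = 15
net_w = component_gain_w - position_loss_w
print(f"net change: {net_w} W")
```

In other words, roughly 60% to 68% of everything I put into the pedals goes to pushing my own body through the air, which is why position matters more than a small weight delta.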

The P5x is not the only data-driven research and development Cervelo has undertaken.  It seems like every organization has a “Project California” somewhere in its history.  My former employer and Cervelo are no different.  Cervelo began its Project California in 2008, testing composites, manufacturing techniques, and design processes.  The current pinnacle of this design effort is the Cervelo RCa road frame.  Cervelo has published that in developing the RCa it undertook 279 finite element analyses.  Finite element analysis (FEA) is a computational method for predicting how an object reacts to real-world forces: the object is broken into many small elements, and the response of each is solved numerically.  Before the first physical frame was molded, Cervelo produced 93 virtual frames through a process it refers to as CASE, Concurrent Aero and Structural Engineering.  This process involved simulations and computational fluid dynamics to determine aerodynamic characteristics.
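For readers who have not seen FEA before, here is a minimal sketch of the idea in Python: a straight bar, fixed at one end and pulled at the other, split into a handful of elements.  This is a textbook toy and not Cervelo's method; the material, dimensions, and load are hypothetical, and a real frame analysis would use three-dimensional composite models in dedicated software.

```python
# Minimal 1D finite element sketch: an axially loaded bar, fixed at one end.
# ASSUMPTIONS: material, geometry, and load are hypothetical and unrelated to
# any Cervelo frame.

def solve_bar(n_elements, length_m, area_m2, youngs_pa, tip_force_n):
    """Return nodal displacements (m) for a bar fixed at node 0, pulled at the tip."""
    n_nodes = n_elements + 1
    k = youngs_pa * area_m2 / (length_m / n_elements)  # stiffness of one element

    # Assemble the global stiffness matrix from identical 2x2 element matrices.
    K = [[0.0] * n_nodes for _ in range(n_nodes)]
    for e in range(n_elements):
        K[e][e] += k
        K[e][e + 1] -= k
        K[e + 1][e] -= k
        K[e + 1][e + 1] += k

    # Load vector: a single force at the free end.
    f = [0.0] * n_nodes
    f[-1] = tip_force_n

    # Apply the fixed boundary condition by dropping node 0's row and column.
    A = [row[1:] for row in K[1:]]
    b = f[1:]

    # Gaussian elimination without pivoting (fine here: the reduced matrix is
    # symmetric positive definite).
    n = len(b)
    for i in range(n):
        for j in range(i + 1, n):
            factor = A[j][i] / A[i][i]
            for c in range(i, n):
                A[j][c] -= factor * A[i][c]
            b[j] -= factor * b[i]

    # Back substitution.
    u = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][c] * u[c] for c in range(i + 1, n))
        u[i] = (b[i] - s) / A[i][i]

    return [0.0] + u  # prepend the fixed node


# Hypothetical bar: 1 m long, 1 cm^2 cross section, 70 GPa modulus, 1 kN pull.
disp = solve_bar(n_elements=4, length_m=1.0, area_m2=1e-4,
                 youngs_pa=70e9, tip_force_n=1000.0)
analytic_tip = 1000.0 * 1.0 / (70e9 * 1e-4)  # closed form: F * L / (E * A)

print("nodal displacements (m):", disp)
print("analytic tip displacement (m):", analytic_tip)
```

For this simple case the computed tip displacement matches the closed-form answer, which is a handy sanity check.  The value of FEA is that the same assemble-and-solve pattern still works on shapes with no closed-form answer, such as a bicycle frame.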

From a technological and data analytics perspective, we want to understand how this development effort takes place.  We know the first step of the design process involved building the terabyte dataset and mining it.  In an interview on the YouTube channel Triathlon Tarren, a Cervelo rep explained that Cervelo developed the general form of the P5x by taking away all the rules, leaving only what was necessary to define a bicycle, and plugging it into a supercomputer.  As technologists we can assume this to be a broad simplification.  To understand what this process might have looked like, we can infer from published accounts of a comparable effort.

In 2012 the National Center for High-performance Computing (NCHC), part of Taiwan’s National Applied Research Laboratories, set out to develop an aerodynamically optimized bicycle designed by a high-performance computing (HPC) cluster.  The cluster was equipped with NVIDIA GPUs, Intel E5 Xeon processors, and 8.4TB of total system memory.  Computational fluid dynamics (CFD) algorithms were coupled with genetic algorithm (GA) optimization engines.  Augmented reality provided visualization and interactivity with the simulation in real time, while the genetic algorithm essentially applied a Darwinian survival-of-the-fittest model to 7,500 simulations over 500 generations.  This kind of interactive simulation and data-driven design would not have been possible without HPC and GPU technology.
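To make the survival-of-the-fittest idea concrete, here is a toy genetic algorithm in Python.  It is a sketch under stated assumptions, not NCHC's implementation: toy_drag() is a made-up function standing in for a CFD run, the four shape parameters are hypothetical, and the population of 15 is simply what you get by dividing 7,500 simulations by 500 generations.

```python
import random

# Toy genetic algorithm sketch. ASSUMPTIONS: toy_drag() is a made-up stand-in
# for a CFD simulation, and the population size of 15 is inferred from
# 7500 simulations / 500 generations. NCHC's actual setup may differ.

GENERATIONS = 500
POPULATION = 15
N_PARAMS = 4  # hypothetical shape parameters, each in [0, 1]

evaluations = 0


def toy_drag(shape):
    """Pretend CFD solver: lower is better, minimum at 0.3 for every parameter."""
    global evaluations
    evaluations += 1
    return sum((x - 0.3) ** 2 for x in shape)


def mutate(shape, rng, scale=0.05):
    """Nudge each parameter with Gaussian noise, clamped to [0, 1]."""
    return [min(1.0, max(0.0, x + rng.gauss(0, scale))) for x in shape]


def crossover(a, b, rng):
    """Uniform crossover: take each parameter from one parent at random."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def evolve(seed=0):
    """Run the GA and return the best drag score seen in each generation."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(N_PARAMS)]
                  for _ in range(POPULATION)]
    best_history = []
    for _ in range(GENERATIONS):
        scored = sorted((toy_drag(s), s) for s in population)
        best_history.append(scored[0][0])
        survivors = [s for _, s in scored[:5]]   # only the fittest breed
        children = [survivors[0]]                # elitism: keep the best as is
        while len(children) < POPULATION:
            a, b = rng.sample(survivors, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = children
    return best_history


history = evolve()
print(f"simulations run: {evaluations}")
print(f"best drag, first generation: {history[0]:.4f}")
print(f"best drag, last generation:  {history[-1]:.6f}")
```

Each generation scores every candidate shape, keeps the best few, and breeds the next generation from them with small random changes.  In the real project each call to the scoring function would be a full CFD simulation, which is where the HPC cluster and GPUs earn their keep.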

The effect of high-performance computing and data-driven design is manifest in the work being done by bicycle manufacturers like Cervelo.  You need only look at places like Global Triathlon Network (GTN) or, God help me, my son’s screensaver to see the adoption and resonance of groundbreaking and innovative designs like the P5x.  The effect of these data-driven technologies, simulation, and analytics in radically transforming cycling and triathlon stretches well beyond design and manufacturing.  For even the casual age grouper, the accessibility of analytics, simulation, and data-driven tools is transformative.  In a later blog we will look to tackle end-user analytical tools and simulations.

I would like to acknowledge the following sources for much of the material in this article, including the Triathlon Tarren YouTube channel and the Guardian.