The Abacus and The Canvas

Luca Cazzanti's blog

The Data Analysis Hype Cycle

[Figure: Gartner Hype Cycle]

You may be familiar with Gartner's hype cycle, which illustrates the ups and downs of a new technology as it progresses along its journey from inception to maturity, or falls by the wayside. Initially, a technological breakthrough seems promising and gains visibility in the tech community and the popular press. Soon the promise turns into hype, where the expectations that the new technology will succeed, solve important problems, and apply broadly become inflated. As diligent technologists continue to develop the original breakthrough, its limitations and shortcomings become known. Expectations deflate and the new technology reaches a trough of disillusionment, where its usefulness and impact are questioned. Successful technologies withstand the vetting process and climb out of the trough on an upward slope of enlightenment, where capabilities and limitations continue to be refined and better characterized by the experts. Finally, a technology reaches a plateau of productivity: applications transparently embed the new technology, which is widely adopted and, soon enough, taken for granted by the end users.

When I first encountered Gartner's hype cycle, I recognized how, on a micro scale, it can just as well describe the various stages of a data analysis. I call it the data analysis hype cycle, and I have experienced it in just about every data analysis project I've come across as a data scientist or as a data science manager. Perhaps you'll recognize it too; it goes something like this: You are given a problem, and you are able to put together a meaningful dataset and a machine learning model based on the data. Wow, the results look great! You share them with the data science team, then with marketing and business development, and the whole company is abuzz. Surely this will solve your company's predictive challenges, right? Wait, are the results too good to be true? Are there business constraints that limit the applicability of your analysis? Was the dataset biased? Oh no, your neat model and its derived insights are totally useless! You are disillusioned, you must start from scratch, perhaps eat some crow, and forget about that bonus ... hold on ... look at the results more carefully: there is a pocket of goodness in a subset of the data. Maybe, just maybe, it's not all wasted. You refine your analysis, rework your model, modify your assumptions, articulate the performance bounds, and yes! It turns out all this work is actually useful in some cases. No, it's not a cure-all, and it will not revolutionize your industry, but it is a step forward, and your company can productize it for a meaningfully large market.
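To make the "pocket of goodness" step concrete, here is a minimal sketch in Python of looking past a single aggregate score and breaking the evaluation down by segment. Everything in it is an assumption made up for illustration: the segment names, the toy records, and the `accuracy_by_segment` helper do not come from any real project.

```python
from collections import defaultdict

def accuracy_by_segment(records):
    """Return {segment: accuracy} for (segment, y_true, y_pred) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for segment, y_true, y_pred in records:
        total[segment] += 1
        correct[segment] += int(y_true == y_pred)
    return {seg: correct[seg] / total[seg] for seg in total}

# Hypothetical evaluation records: (segment, true label, predicted label).
records = [
    ("enterprise", 1, 1), ("enterprise", 0, 0),
    ("enterprise", 1, 1), ("enterprise", 0, 0),
    ("consumer", 1, 0), ("consumer", 0, 1),
    ("consumer", 1, 1), ("consumer", 0, 1),
]

# The aggregate score hides the fact that one segment does well.
overall = sum(t == p for _, t, p in records) / len(records)
per_segment = accuracy_by_segment(records)

print(f"overall accuracy: {overall:.2f}")
for seg, acc in sorted(per_segment.items()):
    print(f"{seg}: {acc:.2f}")
```

In this toy data the overall accuracy looks mediocre, while the per-segment view shows that the model works well for one group and poorly for the other. A real analysis would also check that each segment is large enough for the difference to be meaningful.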

As a data science manager, how do you help your team members through the data analysis hype cycle? First, I'll say this: going through the hype cycle is good and necessary. It helps vet new ideas and de-risks the downstream business processes that depend on the outcome of the analysis (a machine learning model, a new insight). So the goal is not to avoid the cycle, but to manage its stages. I have found the following helpful:

  1. Harness and leverage the early enthusiasm: allow your team members to be excited about their nifty new idea, but ask specific questions early to limit the height of the peak of inflated expectations. Ask about technical details, and ask about relevance to the overall business (see my point further down), but don't kill the enthusiasm: often a team member will self-regulate as the data analysis continues. Later, when the team member is struggling in the trough of disillusionment, remind them of the early enthusiasm they felt, and of the potential impact their work can have.

  2. Create a safe space for fail-and-learn: most new ideas will not survive the entire hype cycle. That's OK. Just make sure the team learns from the failures. Was it a technical detail that escaped scrutiny? Did the analysis address the wrong higher-level business question? Did the market shift abruptly, making the data analysis irrelevant? Did the insight from the data analysis turn out to be not that interesting after all? Reassure your team that the lessons learned are important, and that it's OK to fail.

  3. Create a shared knowledge base for the team: I don't mean creating a wiki or writing soul-sucking post-mortems, although wikis can be helpful in general. I mean using the data analysis cycle as a way to create and reinforce a shared team culture, where team members are aware of the various threads of work being tackled by the team, are helpful to each other, and develop a common language and shared experiences that will help them tackle ever-more-challenging projects. Your data science team meetings should exude the feeling that "we are all in this together, and we are going through the ups and downs together."

  4. Mentor junior scientists on the data analysis workflow: Raw skill in approaching data analyses wins! Help your team develop these skills, and especially help them develop the ability to ask the right questions about the data. Knowing the right question to ask, how to refine it, and when to change it helps keep the ups and downs of the hype cycle to manageable levels. You may find helpful a blog post I wrote on the types of questions for data analysis.

  5. Celebrate those plateaus of productivity: As the data analysis results (an insight, a machine learning model, a processing pipeline) gain traction and reach the product, celebrate in a way commensurate with their impact on the overall business. Recognize the efforts of the team, summarize the lessons learned, and carry the enthusiasm forward to the next challenge.

Happy data analysis!