How To Make Datathons and Hackathons Sustainable
What’s a “datathon”? Here’s one suggested definition from NYU’s Institute for Public Knowledge:
Modeled after hackathons, datathons are 24-hour workshops in which small groups of social scientists collaborate intensely on a problem defined ahead-of-time by the host committee. Hackathons originally stemmed from the collaborative world of open-source projects and have highlighted the value of intensive collaborative activity. Likewise datathons bring together social scientists with common interests to work towards a short culminating presentation at the end of the 24-hour period.
A good problem for a datathon is one for which there is data available - either through public datasets or from collaborators’ own research. The general framework is fairly flexible and the purpose is to give social science experts a space and time to work enthusiastically and collaboratively to produce a short piece of writing, a data visualization, or another deliverable of their choosing in a short period of time.
A basic element of datathon sustainability is keeping the effort going after the event. Listen to the interview with Mike Reich of Seabourne Inc. conducted by Travis Korte of the Center for Data Innovation. This is what Mike says at 6:55 into the interview:
Everyone’s been to a hackathon where they left feeling very excited and engaged and then it sort of never went anywhere.
The question is, come Monday, what happens when the pumped-up participants go back to their day jobs? As Reich says, a lot of the energy can get lost. While it’s easy to ask the question, “How do we keep the energy going?” let’s dig a little deeper.
I saw firsthand how to make the “energy” sustainable when I participated in a datathon at the World Bank, which I described in Learning from the World Bank’s “Big Data” Exploration Weekend.
Not surprisingly, preparation is key.
Analogy to large projects
One way I think about this is to compare the situation with a large, complex project plan that incorporates both well-defined, repeatable, and measurable activities and more creative or loosely defined development components. Sometimes the outputs from these development components have high potential value, but they also carry a high potential for failure. This mix of activities may require a different or more collaborative management approach than the overall project and its more formal mix of deadlines, deliverables, and stakeholders. In this comparison, the overall project plan describes the involvement of the sponsoring organization, while the datathon weekend represents the more flexible and harder-to-pin-down development activity.
Managing them together requires a balanced management approach. For a large organization, sponsorship of a weekend datathon carries both potential value and potential risks. It is really important, for example, for the business to explain how and why the data are collected. On the other hand, you can’t be too rigid with the developers and analysts, since doing so may choke off meaningful avenues of discovery.
This is why it’s important, I think, to have business sponsors as part of a datathon’s analysis teams. They can quickly provide an understanding of why the data being “hacked” are the way they are and what the business is looking for.
We also want to make sure there is a general understanding of how the datathon’s work fits into the overall strategies of the sponsoring organization, so we’re concerned about promoting sustainability of these datathon activities from two perspectives:
- What can be done before the datathon to help ensure there will be follow-on to high-value weekend activities?
- What can be done after the datathon to promote adoption of the good things that come out of the weekend?
In the case of the weekend activity at the World Bank I wrote about above, one thing that helped a great deal was the involvement of the organizer, DataKind. They helped organize the weekend’s activities and coordinated a collaborative infrastructure so that data, documentation, visualizations, and analyses could be created and shared rapidly online and in real time by the various working groups that formed around a series of real-world problems and real-world data.
This shared management approach worked well. Real problems were defined in advance to accompany real data sets, the weekend’s data hacking proceeded in a spirited fashion, and the sponsoring organizations came away with core analyses, models, visualizations, and findings that could be worked on further, even though most of the weekend participants “went back to their day jobs” and in many cases “lost touch” with most of the other participants. The foundation for “sustainability” had been laid, and it was built upon in follow-up work.
In my own case, I already knew some of the participants, having consulted previously with the World Bank; for me the datathon was another opportunity to maintain those relationships. I also met some new people with whom I’m still in contact.
This was, I think, a good lesson in “sustainability” and represents one way to derive maximum value from an intensive data analysis effort that brings together participants on a temporary but high-energy basis. Preparing before the event and preparing for what happens afterwards are both key, as is the focus on real topic areas with real data and real value.
Copyright (c) 2013 by Dennis D. McDonald