Feature Story

How to Hold a Data Science Competition

Zillow and other companies use contests to solve knotty business problems and connect with new talent.

The “Zestimate” home-value estimation algorithm is one of Zillow’s most prized assets. Over the past 12 years, the online real estate listing firm has used it to slog through billions of data points ranging from the number of bathrooms to property tax estimates to calculate the estimated values of more than 110 million U.S. homes.

In that time, Zillow says, it has steadily improved the Zestimate’s median error rate. More than half the homes in a given area now sell for a price within 5 percent of their Zestimate. But Zillow wants to do even better.

Last year the company turned to the wisdom of crowds for help. Using a data science competition platform hosted by Kaggle, Zillow offered a $1 million prize to a person or team that could most improve Zestimate precision, based upon a set of data provided by the company. The first round attracted 76,000 entries from 4,400 competitors. The top 100 contestants have now moved to the second round, which ends in 2019.

Zillow chose to submit the problem to a community, in part, to gain new perspectives. It wasn’t disappointed.

“I expected that the top teams would arrive at the same answers, but we found that the third-place team’s solution was totally uncorrelated with the others,” said Andy Martin, a senior manager in Zillow’s Data Science & Engineering group. “That was a surprise.” In fact, real estate professionals are not heavily represented among the 100 finalists.

Putting proprietary data out to the wide world might be a foreign concept to some executives, but Zillow’s experience shows two enticing benefits: improving performance in key business areas and connecting with new talent in the competitive data science space.

Crowdsourced Wisdom

Gaining diversity of opinion is just one reason hundreds of organizations now regularly conduct data science competitions. An estimated 1.5 million people regularly participate in contests hosted by organizers like Kaggle, TopCoder, CrowdAnalytix and Driven Data on behalf of the world’s largest companies.

Data science problems are uniquely suited to crowdsourcing because “there’s usually no one right answer,” said Divyabh Mishra, CEO of CrowdAnalytix, a crowdsourced analytics service focused on life sciences and professional services firms. “It’s good to have multiple independent opinions.”

The varied contestant backgrounds that Zillow saw aren’t unusual. “We almost always find that the people who win our contests have nothing to do with the industry that sponsors them,” said Mike Morris, CEO of TopCoder, which has sponsored competitions for more than 15 years.

With salaries for data science professionals averaging $120,000, contests can be a cost-effective way for resource-limited organizations to access top talent. They can also open up new ideas for using data an organization already has.

“We use competitions not only to surface solutions, but to understand opportunities and to connect internal problem owners with resources and networks that can help,” said Trevor Monroe, a program officer with the Innovation Labs at the World Bank, which has regularly conducted competitions since 2014.

The nature of the problems the bank addresses – such as finding new ways to make the world’s climate more resilient – lends itself to interdisciplinary cooperation and new perspectives, Monroe said. “The best talent is beyond your organizational walls.”

Many contestants aren’t even professional data scientists. They find the competitions to be a useful way to hone their skills, find employers or clients or just have fun.

“I meet awesome people with amazing skills from all over the world,” said Wladimir Leite, who has won 38 TopCoder data science matches since 2003. In addition to stimulating his problem-solving skills, competitions have helped Leite expand his horizons beyond his day job in computer forensics. Contests are “great opportunities to get in touch with new things,” he says. “Competitors with the best results share their ideas in a post-match discussion, and that can be very interesting for everyone who took part.”

How to Host Your Own

If you think crowdsourced data science might be for you, get ready to do some homework. Hosting a competition isn’t as simple as posting a one-paragraph description of a problem on a bulletin board. Experts say the quality of the solutions relates directly to the rigor with which the challenge is defined. “Our matches typically last two to three weeks, but the prep work takes about a month,” said TopCoder’s Morris.

Not every problem lends itself well to a competitive format. “It can’t be one you can solve with a basic linear regression,” said Greg Lipstein, co-founder of Driven Data. “It has to invite multiple approaches.”

Choose clear goals and a well-defined data set. Variables should be clearly labeled, and the data should be cleaned of errors and inconsistencies. It’s okay to anonymize data and labels, but there should be no question about how variables relate to the desired solution. “Contestants aren’t going to take raw data and work with it. That never works,” said CrowdAnalytix’s Mishra.

Limit the scope of the problem as much as possible. For example, a poor choice for a challenge would be “Determine why retail sales are slow in February,” because of the number of possible variables. A better problem statement is “Correlate weather conditions in February over the past five years with a 10 percent or more decline in retail sales.” If using a historical problem for which results are known, withhold the outcome. Or challenge contestants to use historical data sets to predict future outcomes.
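
One way to withhold the outcome is to split the historical records before release. Contestants get a fully labeled training portion plus a holdout portion with the result column stripped, while the organizer keeps the holdout answers private for scoring. The Python sketch below is illustrative only: the function name, the `id` and `sales_change` field names, and the 30 percent holdout share are assumptions, not details from any platform mentioned in this story.

```python
import random


def split_for_contest(records, target_key, id_key="id",
                      holdout_fraction=0.3, seed=0):
    """Split historical records into contest-ready pieces.

    Returns (train, public_holdout, hidden_answers):
      train          - full records, outcome included, for contestants
      public_holdout - holdout records with the outcome removed
      hidden_answers - {record id: outcome}, kept private by the organizer
    All names and defaults here are hypothetical.
    """
    rng = random.Random(seed)       # fixed seed makes the split reproducible
    shuffled = list(records)        # copy so the caller's list is untouched
    rng.shuffle(shuffled)

    n_holdout = round(len(shuffled) * holdout_fraction)
    holdout, train = shuffled[:n_holdout], shuffled[n_holdout:]

    # Strip the outcome from the copy that contestants will see.
    public_holdout = [
        {key: value for key, value in record.items() if key != target_key}
        for record in holdout
    ]
    # Keep the true outcomes, keyed by id, for scoring submissions later.
    hidden_answers = {record[id_key]: record[target_key] for record in holdout}
    return train, public_holdout, hidden_answers
```

Shuffling before the split keeps the holdout from being, say, only the most recent months. For a contest that asks contestants to predict future outcomes, a split by date would be the more natural choice.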

Either way, don’t confuse contestants with too much information, says CrowdAnalytix’s Mishra. “The goal of a contest is to find patterns in the data. They don’t need to know what the problems represent,” he says.

Screening entries quickly is essential, so have algorithms in place to evaluate submissions before flinging open the doors. The evaluation model should pair the independent variables with your own calculated outcome, which sets a baseline for comparison. Mishra recommends the root-mean-square deviation, which summarizes how far predictions fall from the actual values, to measure output quality.
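
The root-mean-square deviation is the square root of the average squared gap between predicted and actual values; lower is better, and a perfect submission scores zero. Mishra names the metric but not an implementation, so the Python sketch below is only one plausible way to automate the check. The function names and the idea of returning a small result dictionary are assumptions made for illustration.

```python
import math


def rmsd(predicted, actual):
    """Root-mean-square deviation between two equal-length sequences."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("predicted and actual must be non-empty and equal in length")
    squared_gaps = ((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(sum(squared_gaps) / len(actual))


def score_submission(submission, actual, baseline):
    """Score one entry against the organizer's own baseline predictions.

    `baseline` stands in for the organizer's calculated outcome; a useful
    entry should come in with a lower RMSD than the baseline does.
    """
    submission_rmsd = rmsd(submission, actual)
    baseline_rmsd = rmsd(baseline, actual)
    return {
        "rmsd": submission_rmsd,
        "baseline_rmsd": baseline_rmsd,
        "beats_baseline": submission_rmsd < baseline_rmsd,
    }
```

Because each score is a single number computed the same way for every entry, results like these can be ranked automatically as submissions arrive.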

Status and professional recognition are as important as money in rewarding performance, experts say. A leaderboard is a form of gamification that gives competitors a continuously updated measure of their performance. Contestant profiles include lifetime metrics, competitions won, badges and professional background information. “People continually tell us they’re here because they love competing,” said TopCoder’s Morris.

Finding a pool of contestants can be challenging. When the founders of Driven Data were getting started, they used their university network to spread the word. Many graduate programs are looking for real-world problems to solve, and there are online forums where data scientists congregate. “It was a lot of word-of-mouth,” Lipstein remembered.

Hiring a professional competition organizer costs more than doing it yourself, but these organizers already have communities in place.

Once you launch your problem out into the community, keep an open mind. The best answers may come from places you never even considered.