Big data can be a powerful predictor.
Businesses use it to identify opportunities for growth in new markets. Analysts have used big data to predict everything from Oscar winners to the outcome of the recent presidential election.
Data analysts were as confident that Leonardo DiCaprio would win the Academy Award for Best Actor last year as they were that Hillary Clinton would win the 2016 election.
Well, they were half right. On election day, 538 gave Trump the most generous chance of winning the Electoral College (28.6%), while other data analysts gave him as little as a 2% chance. By now, you know that he managed to collect 290 electoral votes, 20 more than he needed to win the election.
Why were the predictions of the presidential winner so wrong?
Most data analysts will tell you it wasn’t big data that was flawed, but the analysis.
Every day, data scientists gather hundreds of thousands of data points. They study digital conversations to gauge the tilt of online discussions.
The data itself should not be seen as a predictor, but rather as the knowledge that fuels artificial intelligence (AI).
A New York Times article referred to data science as “a technology with trade-offs. It can see things as never before but also can be a blunt instrument, missing context and nuance.”
Professor Erik Brynjolfsson, Director of the MIT Initiative on the Digital Economy, said, “the key thing to understand is that data science is a tool that is not necessarily going to give you answers, but probabilities.”
However, even the “probability” of Trump winning seemed far from accurate.
The New York Times says that the failed predictions of the 2016 election outcome suggest that “the rush to exploit data may have outstripped the ability to recognize its limits.”
Data scientists point to flaws in polling as a big contributor to the inaccuracy of the prediction. Despite the historical data gathered from decades of polls, this particular election was different. A large percentage of voters remained “undecided” right up to the moment they cast their ballots. While they might have voiced possible support for one candidate, their choice shifted in the privacy of the voting booth. How can a poll accurately gauge the depth of indecision?
Big data is an ingredient, not the end result.
Human error, in the form of incorrect assumptions and flawed interpretations of data, is more likely the major factor behind the failed predictions, and a reminder that data is an ingredient, not the end result.
“Tuesday was not a failure of data; it was a failure of forecasting and analysis by humans,” explained Aaron Timms, the Director of Content at Predata, a New York-based predictive analytics firm.
“The data was as good as it could be, but the analysis of it lacked depth.”
The data mining errors we just witnessed exposed the need to dig more deeply into both the power and limitations of this “science.”
Just as Thomas Edison is said to have failed 10,000 times before successfully inventing the light bulb, we must learn from mistakes, and expect to make more, in order to see the future through the lens of big data.
Data is one piece of the big picture and, when analyzed and used properly, tells a valuable part of the story for businesses and politics.
What are your thoughts on how to properly analyze big data? Share with us on Twitter @LTronCorp