Big data analytics have proven their mettle in predicting baseball and election success. How can you make them work for your business?
Only one number mattered to the data analysis aficionados watching the presidential election results on Tuesday night. That number, 538, is the name of Nate Silver's blog, FiveThirtyEight (now under the auspices of The New York Times). Silver, applying predictive analytics to a range of polling and related data, went a perfect 50 for 50 in his state-by-state predictions.
Remember, this was in a race where the pundits for the losing side were confident in their landslide-win predictions, and pundits on the winning side were predicting a razor-thin victory margin where vote counts and recounts could stretch for weeks. While Silver was spot-on, the pundit stars were overwhelmingly wrong.
Silver is now a media star with sales of his recent book, The Signal and the Noise: Why So Many Predictions Fail -- But Some Don't, up 850%. Does Silver's success mean that comments about aggregated analytics, gamma distributions and sum-of-squares formulas will now become de rigueur on the Washington cocktail circuit? That all those math teachers telling toiling high school students that statistics really will be useful in real life will finally be vindicated? That all the talk about "big data as the next big thing" will actually prove to be more than passing buzz?
Let's talk about that third item. In my opinion, Silver's success is less about big data (which is quickly being overused into meaninglessness) and is more about rigor, innovation and looking outside your business confines to find inspiration. And baseball -- baseball is important also.
Silver is not secretive about his methodology. You can read it here. (By the way, the Times has a paywall which allows for 10 free articles; if you're worried about hitting it, follow these tips for an end run.) The methodology's seven steps are laid out in detail, but just following them will not make you a political-prediction superstar. Silver provides the recipe, but not the measure of ingredients used in each step.
His weighting of individual polls is part of the secret sauce, or more precisely the end result of the scientific method applied to statistics. That secret sauce is derived through trial, error and adjustment: the same steps scientists have used for years to conduct experiments. Gut checks are replaced by rigor, thinking outside the normal confines and fine-tuning. The process is not fast; it is measured in years, and it supports the concept that computers can aid in prediction but cannot totally usurp the intelligent, curious human at the keyboard.
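To make the idea of weighting polls concrete, here is a minimal Python sketch of a weighted polling average. It is not Silver's model: the `Poll` fields, the 14-day half-life and the square-root-of-sample-size weight are all assumptions chosen for illustration, exactly the kind of "measure of ingredients" that would have to be tuned through trial, error and adjustment.

```python
from dataclasses import dataclass


@dataclass
class Poll:
    candidate_share: float  # candidate's share in this poll, in percent
    sample_size: int        # number of respondents
    days_old: int           # days since the poll was conducted


def poll_weight(poll: Poll, half_life_days: float = 14.0) -> float:
    """Illustrative weight: newer and larger polls count for more.

    The half-life and the square-root scaling are assumptions,
    not published values from any real forecasting model.
    """
    recency = 0.5 ** (poll.days_old / half_life_days)
    return recency * poll.sample_size ** 0.5


def weighted_average(polls: list[Poll], half_life_days: float = 14.0) -> float:
    """Combine polls into one estimate using the weights above."""
    weights = [poll_weight(p, half_life_days) for p in polls]
    total = sum(weights)
    if total == 0:
        raise ValueError("no polls with positive weight")
    return sum(w * p.candidate_share for w, p in zip(weights, polls)) / total


if __name__ == "__main__":
    polls = [Poll(52.0, 400, 0), Poll(48.0, 400, 14)]
    print(round(weighted_average(polls), 2))
```

With these made-up inputs, the fresher poll counts twice as much as the two-week-old one, so the combined estimate leans toward it. Changing the half-life changes the answer, which is why the tuning takes years rather than an afternoon.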
I've found one of the best explanations of Silver's methods to be in the Wikipedia entry about his role as a "practical statistician." "The practical statistician first needs a sound understanding of how baseball, poker, elections or other uncertain processes work, what measures are reliable and which not, what scales of aggregation are useful, and then to utilize the statistical tool kit as well as possible."
And that brings me to baseball. Silver's first rise to fame came in 2003 when he developed the PECOTA (player empirical comparison and optimization test algorithm) system for predicting hitters' and pitchers' future outlook. Is baseball the same as politics? Yes and no. Yes, in that there is an old-guard baseball scouting system that relies on gut feel, and a new guard melding statistics and new ways of player measurement. This old guard/new guard division was best laid out in the book and movie Moneyball.
So what do politics and baseball have to do with your business? The goal of all the discussion around big data and data analysis is, as I've argued, not to make the wrong decision faster, but to develop the best decision at the right time and deliver the information to the people who most need it. In an InformationWeek column Wednesday, Tony Byrne argued that small data beat big data in the presidential election.
Whether you call it business intelligence, data analysis or predictive analytics, IT's role here is to provide a foundation for your company to make the right decisions. Those decisions might be what to charge passengers for seats on a flight, how much to charge for a season ticket or how many widgets to create to strike the right balance among manufacturing costs, inventory and availability. These decisions are fundamental to business success.
There is no magic to Silver's methods. There is hard work, a willingness to make mistakes and adjust, and a realization that the common wisdom is sometimes not wisdom at all. Innovation can happen in strange places and Silver has shown that the buttoned-down world of statistics doesn't have to be that buttoned down at all.
Here's my advice: Pay attention to Silver's process, but be equally assertive in looking at how your company operates. You will probably find lots of silos of activity where each group tends to use the same measures and methods year after year. Your job is to think outside the box, think like a customer and consider all the influences that would go into a purchasing decision. Understand the influences and you will be on your way to developing a prediction model that actually works for your business.