A recent game-day experiment shows that in sports, as in business, even the most detailed big data analysis is worthless if the questions don't make sense.
On July 25, a team of U.S.-based Major League Soccer players generated publicity in the United States and shock in Europe after beating Chelsea, a major power in the globally dominant English Premier League. But publicity and shock weren't all that the game produced.
Players from both teams generated thousands of individual data points on their positioning, conditioning, and performance--all collected by the miCoach GPS-based performance monitoring system from Adidas, which they wore during the game.
Gathering statistics on major-league play is nothing new; gathering statistics so detailed no human could ever perceive the actions they measure, let alone accurately record them by hand, is brand new. So is the expectation that the data will be pooled with as much other performance data as possible, then crunched using big data analytic techniques.
The result, theoretically, would give team managers the kind of information they need to assemble an unbeatable team, or train an existing team to exploit weaknesses in an opposing team that the opponent may not even be aware of.
It would be a big data miracle of the kind documented in Moneyball, the 2003 book describing how the Oakland Athletics built a strong roster despite a weak payroll budget by using statistics to identify players who would be unique assets to the team.
There is a good chance the data gleaned from the July 25 victory will make some U.S. teams better.
There's just as good a chance--according to sources with both practical and academic expertise on the topic--that non-statisticians will misunderstand the meaning of the results they see and make decisions reinforcing exactly the kind of performance they're trying to avoid.
Big Data Leads To Big Decisions, But Not Always The Right Ones
As it turns out, European professional soccer leagues have long been heavy users of statistical services from companies including Amisco, Opta, and Prozone--all of which were founded in the mid to late 1990s, long before Moneyball became a thing, according to Matt Aslett, research manager for The 451 Group.
Statistical analysis of players' performance, nutrition, physical condition, and other factors is far more intense in Europe than in the U.S., according to a BBC profile of Manchester City, which won the English Premier League championship in 2012 for the first time in 44 years.
Manchester City coaches--or their data-crunching counterparts--know the inner workings of their players so well they are able to concoct recovery drinks and nutritional supplements customized to the results of blood and saliva tests for each player. The supplements are given to players on their return from a hard practice so the team can be certain each player gets the right mixture of biochemical raw materials to repair the damage done by 90 minutes of running hard to capture a ball without touching it with their hands.
Much of the information--about injuries, fitness, and potential therapies, if not the biochemical profiles--is even published as part of the teams' effort to help keep fans up to date with minute changes in their favorite players' conditions.
Among other things, many teams attach GPS and vital-sign monitors to players' attire during practice to collect data on heart rate, stress load, distance covered, rate of acceleration and deceleration, and 100 other bits of data defining some qualities of their play.
Some statistics that seem critical are actually meaningless, however, Aslett wrote, while seemingly irrelevant numbers are crucial.
For example, there is virtually no correlation between the distance a player covers on the field during a game and the outcome. There is also little correlation between the number of tackles, shots on goal, or other specific on-field feats and the score at the end of the game.
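To make "virtually no correlation" concrete, here is a minimal sketch of how an analyst might check the relationship between distance covered and match outcome. The figures below are made up purely for illustration and do not come from any team or data provider; the function name `pearson` and both variable names are assumptions of this sketch.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical, invented numbers: distance one player covered in each of
# six matches (km) and the team's goal difference in the same matches.
distance_km = [10.2, 11.5, 9.8, 12.1, 10.9, 11.0]
goal_difference = [1, -1, 0, 0, 2, -2]

r = pearson(distance_km, goal_difference)
print(f"correlation between distance and goal difference: {r:.2f}")
```

A coefficient near zero would mean the statistic says little about the result, however impressive the raw distance looks; a value near 1 or -1 would suggest a strong linear relationship.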
On the other hand, according to a Financial Times interview with Manchester City performance analysis chief Gavin Fleig, using statistical analysis to predict where the ball or the other team's players will be can be a huge advantage.
"We would be looking at, 'If a defender cleared the ball from a long throw, where would the ball land? Well, this is the area it most commonly lands. Right, well that's where we'll put our man,'" Fleig told the Financial Times.
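The approach Fleig describes amounts to finding the area where cleared balls most often land. A rough sketch of that idea, assuming landing positions are recorded as (x, y) coordinates in meters and grouped into square grid cells, might look like the following. The coordinates, the 5-meter cell size, and the function name `most_common_zone` are all hypothetical, and this is not a description of Manchester City's actual tooling.

```python
from collections import Counter


def most_common_zone(landings, cell_size=5.0):
    """Return ((col, row), count) for the grid cell with the most landings."""
    if not landings:
        raise ValueError("no landing positions recorded")
    # Map each (x, y) position to the grid cell that contains it.
    cells = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in landings
    )
    return cells.most_common(1)[0]


# Hypothetical landing spots (meters) for balls cleared from long throws.
clearances = [(22.0, 31.5), (23.4, 33.0), (18.0, 40.2), (21.1, 34.9), (30.5, 28.0)]

zone, count = most_common_zone(clearances)
print(f"most common landing cell: {zone} ({count} of {len(clearances)} clearances)")
```

The cell size is a judgment call: smaller cells give a more precise spot to station a player, but need far more recorded clearances before the counts mean anything.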