Eight insights get to the heart of big data value and point to a future in which we synthesize and make sense of vast data stores.
6. "The same meaning can be expressed in many different ways, and the same expression can express many different meanings."
So say Googlers Alon Halevy, Peter Norvig and Fernando Pereira in their touchstone March 2009 IEEE Intelligent Systems article, "The Unreasonable Effectiveness of Data." How is data's unreasonable effectiveness revealed? Via semantic interpretation of "imprecise and ambiguous" natural languages and by tackling the scientific problem of interpreting massive, aggregated content by inferring relationships via machine learning.
7. "Big data is not about the data! The value in big data [is in] the analytics."
Harvard Prof. Gary King said this, in effect spinning out the Googlers' thoughts (see quote number six). Yet I can't completely agree with King. There is also value in the business process of determining data needs and devising a smart approach to collecting and structuring the data for analysis. Analytics helps you discover that value, so my preferred formulation would be, "the value of big data is discovered via analytics."
Simon explains, "Big data has not, at least not yet, replaced intuition; the latter merely complements the former. The relationship between the two is a continuum, not a binary." Tim Leberecht explores this same point in a June article for CNN, "Why Big Data will never beat business intuition."
Finally, these eight points lead to a future truth, an appraisal that I believe isn't yet widely understood:
9. The future of big data is synthesis and sensemaking.
The missing element from most solutions is the ability to integrate information across sources, in situationally appropriate ways, to generate contextually relevant, usable insights. I'll pull some defining quotations from an illuminating paper by design strategist Jon Kolko (admittedly applying them out of context). First, Kolko cites cognitive psychologists who have been studying the connections between problem solving and intuition, who "reference sensemaking as a way of understanding connections between people, places and events that are occurring now or have occurred in the past, in order to anticipate future trajectories and act accordingly."
Kolko sees [design] synthesis as a key element, a "sensemaking process of manipulating, organizing, pruning and filtering data in an effort to produce information and knowledge." What capabilities are afforded? IBM Fellow Jeff Jonas says "general-purpose" sensemaking systems will colocate diverse data in the same data space. "Such an approach enables massively scalable, real-time, novel discovery over an ever changing observational space."
Isn't that our big data goal, to advance from pattern detection to actionable conclusions? I hope my nine truths have helped you understand the path.
Seth Grimes is the leading industry analyst covering text analytics and sentiment analysis. He founded Washington-based Alta Plana Corporation, an IT strategy consultancy, in 1997. Follow him on Twitter at @SethGrimes.