There's no shortage of noise surrounding big data. Today it seems that every software vendor, consulting firm and thought leader has developed the "right" definition of the term. While I'd argue there is no such definition, I would like to dispel a few of the most commonly held myths about the subject, many of which I explore in Too Big to Ignore: The Business Case for Big Data.
Myth 1: You Can Get To All Of The Data
On many levels, we are living in unprecedented times. Never before has so much data been available to us. Forget megabytes and petabytes; exabytes of data now exist. I read recently that the average person in an industrialized society today consumes more information in one day than his fifteenth-century counterpart did in his lifetime.
Despite this unfathomable amount of data, no person or organization can store and retrieve all of the data on a particular subject, much less overall. And yes, that includes Google. Its software indexes the Surface Web, not the Deep Web. Some estimates put the latter at 25 times the size of the former. As a result, when you search, you are accessing anywhere from 4% to 6% of all information on the Internet.
Taking it down a level or 30, individual authors like me cannot access some very valuable information, such as which specific customers are buying my books. Sites like Amazon and stores like Barnes & Noble keep that information. Nothing would make me happier than knowing my customers, but even in a big data world, that information eludes me. You will never get all of the data. Period.
Deal with it.
Myth 2: You Need All Of The Data
No doubt more data helps, but don't for a minute think that you need all of the data to make an informed business decision. Organizations that are effectively leveraging the power of big data realize that they will never capture all relevant information.
New sources of data spring up seemingly every day, but not all of them are valuable. Email messages, for instance, often contain extremely valuable insights into the state of an enterprise. Smart companies are mining individual messages to gauge employee sentiment and potentially determine who might be exiting.
This is a far cry from saying that all emails are equally valuable. It's hard to make the argument that using text analytics on spam makes much sense.
You don't need all of the data. Yes, more is better than less, but don't waste time trying to achieve the impossible.
Myth 3: Big Data Yields Certainty
A well-trodden business aphorism is, "I have all of the data I can handle. I just need more information." In Too Big to Ignore, I write about the difficulty of being truly certain about business decisions of any import. It's virtually impossible to be completely sure about a merger, product launch, new venture or even an individual employee hire.
But isn't big data supposed to help us with uncertainty? Yes, but don't confuse reducing uncertainty with eliminating it. That day isn't here yet, and I suspect that it won't arrive anytime soon.
Analyzing petabytes of unstructured data may well help companies better understand customer sentiment. However, don't make the mistake of assuming that big data eliminates all variability. The fluctuations of life and business will still throw a wrench into the best-laid plans.
Myth 4: Big Data Is A Temporary Fad
Arguably, the current face of big data is Nate Silver -- or at least it was until he left the New York Times. The blogger and statistician famously made Barack Obama a 90% favorite to win the 2012 U.S. presidential election, even though polls put Obama in a virtual dead heat with Mitt Romney. Silver's model was remarkably accurate, and now everyone is asking for his take on everything.
To be sure, the terms big data and data science may vanish into the ether over the next few years. We do like our buzzwords and jargon. Foolish is the professional, however, who believes that data is a fad. I'm sure of very few things, but I know that in 2013 we will collectively generate and consume more data than we did in 2012.
It's high time that organizations recognize the importance of big data. Refuse, and your company may not be around when the light bulb finally goes on.
Phil Simon is a recognized technology expert, speaker and author. His most recent book, Too Big to Ignore: The Business Case for Big Data, continues his discussion on the myths of collecting and analyzing data.