Big Data FAQ: Separating Signal From Noise
Author Phil Simon answers the most frequently asked questions on the impact of big data on small businesses and individuals, the role of data scientists and the continuing importance of human intuition.
Many questions about big data have yet to be answered in a vendor-neutral way. With so many definitions, opinions run the gamut. Here I will attempt to cut to the heart of the matter by addressing some key questions I often get from readers, clients and industry analysts.
What is the role of intuition in the era of big data? Have machines and data supplanted the human mind?
Contrary to what some people believe, intuition is as important as ever. When looking at massive, unprecedented datasets, you need someplace to start. In Too Big to Ignore, I argue that intuition is more important than ever precisely because there's so much data now. We are entering an era in which more and more things can be tested.
Big data has not replaced intuition -- at least not yet; the former merely complements the latter. The relationship between the two is a continuum, not a binary.
A key piece of big data is its reliance on "unstructured" and "semi-structured" data. Can you explain what's going on here?
Roughly 80% of the information generated today is of an unstructured variety. Small data is still very important -- e.g., lists of customers, sales, employees and the like. Think Excel spreadsheets and database tables. However, tweets, blog posts, Facebook likes, YouTube videos, pictures and other forms of unstructured data have become too big to ignore.
Again, big data here serves as a complement to -- not a substitute for -- small data. When used right, big data can reduce uncertainty, not eliminate it. We can know more about previously unknowable things. We can solve previously vexing problems. And finally, there's the Holy Grail: Big data is helping organizations make better predictions and better business decisions.
Data visualization is becoming more popular than ever. Will dataviz be a requirement for people to be able to understand the insights that big data can deliver?
In my opinion, it is absolutely essential for organizations to embrace interactive data visualization tools. Blame or thank big data for that.
And these tools are amazing. They are helping employees make sense of the never-ending stream of data hitting them faster than ever. Our brains respond much better to visuals than rows on a spreadsheet. Dataviz can help us understand what's going on and ask better questions of the data.
Companies like Amazon, Apple, Facebook, Google, Twitter, Netflix and many others understand the cardinal need to visualize data. And this goes way beyond Excel charts, graphs or even pivot tables. Companies like Tableau Software have allowed non-technical users to create very interactive and imaginative ways to visually represent information.
Data science, some say, is actually a mix of art and science -- the art of knowing what to look at amidst a profusion of information. Can you explain a bit about this? How can people develop those skills?
Data scientist is one of the hottest jobs in the country right now, and probably the world. In a recent report, McKinsey estimated that the U.S. will soon face a shortage of approximately 175,000 data scientists. Demand far exceeds supply, especially given the hype around big data.
However, to become a data scientist one does not necessarily follow a linear path. There are many myths surrounding data scientists. True data scientists possess a wide variety of skills. Most come from backgrounds in statistics, data modeling, computer science and general business. Above all, however, they are a curious lot. They are never really satisfied. They enjoy looking at data and running experiments.
We seem to be entering an era of exponential growth of data. Is there a point at which many enterprise systems will cease to operate?
It's an interesting point, and I discuss it in Chapter 4 of Too Big to Ignore. If we look at the relational databases that organizations have historically used to store and retrieve enterprise information, then you are absolutely right. However, new tools like MapReduce, Hadoop, NoSQL, NewSQL, Amazon Web Services (AWS) and others allow organizations to store much larger data sets. The old boss is not the same as the new boss.
How will big data impact small businesses? Will we see an era where every business (even barbershops or corner stores) will somehow be leveraging big data?
In my book, I write about a few relatively small organizations that have taken advantage of big data. Quantcast is one of them. There's no shortage of myths around big data, and one of the most pernicious is that an organization needs thousands of employees and billions in revenue to take advantage of it. Simply not true.
I don't know whether my electrician or my barber will embrace big data in the near future. I certainly have my doubts. (I haven't visited a proper barber in years, but you get my point.) However, we are living in an era of ubiquitous and democratized technology.
Can you talk about how big data will trickle down and impact individuals? Are there direct ways this will impact our day-to-day lives in the coming years?
It's already happening. Big data is affecting our lives in more ways than we can possibly fathom. The recent NSA Prism scandal shed light on the fact that governments are tracking what we're doing. Companies like Amazon, Apple, Facebook, Google, Twitter and others would not be nearly as effective without big data.
I encourage people not to think about big data in an abstract manner. I wrote Too Big to Ignore to emphasize its practical uses and tell some interesting business stories.
Most people don't work in data centers, so it's more useful for them to know about the companies whose services they use. Are those companies using big data? These days, the answer is probably yes. By extension, then, big data is affecting you whether you know it or not.
In addition, as more and more companies embrace big data, there will be major disruption in the workforce. In the book, I write about how big data will in many instances replace certain jobs. This is always the case with creative destruction.
Are "big data skills" something that everyone will need to learn moving forward? Or will it become simple enough over time that anyone can do it -- much like anyone who knows Microsoft Word can update a website now versus needing to know HTML 15 years ago? What skills do workers need to sharpen to prepare for the era of big data?
I hesitate to say that everyone will need to learn data-related skills. Dataphobes will always exist, for better or worse. (Again, the barber example is a good one.) However, knowledge workers will have to follow, lead or get out of the way. Based upon my research, we have entered a more data-oriented world. Millennials are particularly comfortable with data. They are constantly interacting with technology and data. Wearable technology and the Internet of Things are coming, and soon.
Get used to big data. It really has become too big to ignore.