Planning a big data strategy? Don't be overly ambitious, and always know the problems you're trying to solve.
Been reading up on big data? Maybe wondering if it's time to jump on the ol' Hadoop bandwagon, hire a data scientist or two, start stockpiling petabytes of information, and then... Then... what?
If you're building a big-data platform, but aren't sure why, you're asking for trouble. So says Joel Young, chief technical officer of Digi International, a provider of machine-to-machine (M2M) wireless sensors and cloud solutions for a variety of enterprises in energy, medical, transportation, and other industries.
In a phone interview with InformationWeek, Young says organizations face two key challenges when undertaking a big data project. Number one: Not having a clear understanding of what you're trying to accomplish.
"You end up with a boil-the-ocean strategy," says Young. He cited people who say to him, "'You know what? I've got to get involved in this Internet of Things. We're going to start collecting data because data is powerful and I'm sure we're going to gain a lot of insights from it.'"
When he hears that, "it's like a red flag," he says. "It's like, okay, let's back up here. What is the biggest problem you have? Why do you want to collect all this data? What kind of insight are you looking for? Just saying 'insight' and 'innovation' is a wonderful thing, but first and foremost you need to focus."
Oh, if you're not familiar with the term "boil the ocean," here's a brief definition from the Urban Dictionary: "To attempt something that is way too ambitious, effectively impossible -- an idea too broad in scope to accomplish." Like a big-data project without a clear objective.
"Finding the right data, if you know what it is, is way easier than just collecting monster loads of data and then trying to figure out what to do with it," says Young.
The second challenge is when enterprises assume they'll find a brilliant data guru -- the mythical data scientist unicorn -- who'll possess expert knowledge in a variety of technical and business disciplines required to make their big-data strategy fly.
"If you're connecting things that haven't been connected before, you need to be an expert in everything from embedded design, to cloud computing, to big data, to Web services, to application development," he says. "And there really isn't any one person who would be good at all of those things. And if they say they are, I think they're fibbing."
Rather, an effective big data solution often requires an in-house data science team or outside consultants to sharpen the focus.
The data-driven Internet of Things (IoT), where billions of wireless devices communicate seamlessly on a global scale, faces technological challenges as well. For instance, today's menu of wireless solutions, including Wi-Fi, Bluetooth, ZigBee, and various forms of cellular such as LTE, is far from elegant and seamless.
"Over time there's a natural pruning... as certain technologies mature, hit their stride, and become more pervasive," Young says.
Security is another area that needs attention. To avoid breaches like the recent IoT-based cyber attack in which spam email was sent via consumer devices, connected hardware should be designed with all interfaces shut off except for the one needed to exchange data.
You should "make sure that interface is monitored and controlled," he says. "If you do that, it's not very hard to design security into the system."
Despite a growing interest in big data projects, many enterprises are still in the testing phase. "Opportunities are growing rapidly, but large-scale rollouts are a bit slower," he says. "There's a lot of activity and so many [companies] are trying to do proof of concepts."
Jeff Bertolucci is a technology journalist in Los Angeles who writes mostly for Kiplinger's Personal Finance, The Saturday Evening Post, and InformationWeek.