Facebook's Jay Parikh talks about fixing Hive, real-time platforms and how traditional companies can 'thread the needle' of big data success.
IW: Are there graph-analysis possibilities for ordinary companies and is the technology very mature?
Parikh: Graphs are not new, but there are definitely more technologies available now in terms of commercial and open-source graph databases. It's yet another cool piece of technology that lets you derive insight, but it's not going to supplant enterprise applications like fraud-detection or e-commerce that are already highly optimized on relational databases.
The ecosystem around graph technology is very under-developed, and I don't think it will ever become as developed as the relational world because it's not general purpose. Graphs will develop, but it's going to be just yet another piece of technology that lets companies carve off and optimize a few key applications.
IW: Do you have any advice for enterprise IT shops venturing into big data?
Parikh: You're going to have the big-data Hadoop-Hive world, and then you're going to have some specialized real-time systems and you're going to have some specialized graph processing engines. Most IT shops, if they're good and they have a lot of applications to deal with, are going to end up in this world.
Everybody is dealing with scale today, and it's getting to be a more difficult challenge in terms of the amount of data that people want to collect and analyze. Sometimes companies are collecting data and they don't know what to do with it yet, or they're collecting data that they don't even know they have. The fundamental problems are how do you store it, how do you process it and how do you derive useful insights? If you aren't careful as you build out big data applications, you stand to waste a lot of money or you stand to miss huge opportunities in your business. Threading that needle is what every tech company in the world has to do, and most companies won't be able to do it well.
IW: Why not?
Parikh: It's very hard to manage the balance between storing too much and then trying to find something valuable or partitioning your data among different business units and not being able to get insight across the business. We're in an early phase of this technology. It's not something that's insurmountable and people are figuring it out. But storing the data, determining what you do with it, writing the applications and responding to the insight from the data is the balancing act that every tech organization is going to work on.
IW: The "wasting a lot of money" danger is pretty clear -- too much data, too little value. Any advice on how not to miss the opportunity?
Parikh: It's crucial to understand the data that you're collecting and to react to it to change your business. If you're just focused on the tip of the data, you may be missing a longer-term trend. You might be fixated on just a couple of bits of data and not looking at other bits that might be significant. You need a micro, laser focus on impact, but you also need to have a broad perspective on where you're going with all the data.
You may be focused on decisions with real-time data, but are you missing a longer-term impact on your business if you're not looking at your entire data set? It takes a lot of iteration and experimentation to succeed. It's an exciting time and there are lots of cool things for enterprises to try, but it's hard work and the technologies are still maturing.