More businesses are rolling out big data management systems, but are they choosing the right tools to handle the fabled three V's--volume, variety, and velocity--of their data? Brian Gentile, CEO of business intelligence software company Jaspersoft, says many enterprises hurriedly implement a big data platform without exploring all of their options. As a result, they might wind up with a system that doesn't meet their needs.
A recent Jaspersoft survey of its user base shows that 62% of respondents have either already deployed a big data solution or plan to do so in the next 12 months.
"It was surprising how many of them were already moving down the path aggressively with some sort of big data store and big data application," said Gentile in a phone interview with InformationWeek.
Gentile's 27 years of experience in the software industry, including executive stints at Informatica, Brio Software, and Sun Microsystems, have given him a front-row seat to many technology trends, including the recent emergence of big data.
Recently Gentile has seen many organizations roll out, or at least experiment with, a big data platform, and he has a little friendly advice on how to do it right. Here are three major tasks he believes corporations must nail in order to successfully implement a big data plan.
1. Choose the right big data store. Everyone's heard of Hadoop, and many assume it's the best software framework for their big data platform. But one of the most common mistakes an organization makes is starting to work with Hadoop before fully understanding which technology is best suited to its needs.
"Their best choice may well be one of the other types of software big data stores or frameworks, like from the NoSQL category," said Gentile. For instance, Hadoop is highly scalable, relatively inexpensive, and adept at handling mixed data types, but it's not built for real-time data streaming. "It's all about matching the volume, variety, and velocity of data with the right choice of back-end technology: Hadoop vs. NoSQL vs. analytical data stores. They all have pluses and minuses, and you need to be educated on them in order to make a good choice," he said.
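To make the matching exercise concrete, here is a minimal sketch in Python of how a team might shortlist store categories from a workload's three V's. It is illustrative only: the `Workload` fields, the `candidate_stores` function, the 10 TB threshold, and the rules pairing NoSQL and analytical data stores with particular workloads are all assumptions made for this example, not Gentile's method. The only rule taken from the article is that Hadoop is not built for real-time data streaming.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Workload:
    """Hypothetical description of a workload along the three V's."""
    volume_tb: float        # volume: approximate data size in terabytes
    mixed_data_types: bool  # variety: structured plus unstructured data
    needs_real_time: bool   # velocity: must data be used within seconds?


def candidate_stores(w: Workload, large_volume_tb: float = 10.0) -> List[str]:
    """Return store categories worth evaluating, not a final decision.

    The threshold and most rules here are illustrative assumptions.
    """
    candidates = []
    # From the article: Hadoop scales well and handles mixed data types,
    # but is not built for real-time streaming.
    if not w.needs_real_time and (
        w.mixed_data_types or w.volume_tb >= large_volume_tb
    ):
        candidates.append("Hadoop")
    # Assumption: low-latency workloads point toward the NoSQL category.
    if w.needs_real_time:
        candidates.append("NoSQL")
    # Assumption: uniform, non-real-time data suits an analytical data store.
    if not w.mixed_data_types and not w.needs_real_time:
        candidates.append("analytical data store")
    return candidates


if __name__ == "__main__":
    sensor_feed = Workload(volume_tb=2.0, mixed_data_types=True, needs_real_time=True)
    print(candidate_stores(sensor_feed))
```

A checklist like this only narrows the field. The pluses and minuses of each option still have to be weighed against the business problem, which is where the domain knowledge in the next point comes in.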
2. Have deep domain knowledge. Your project needs input from technologists, naturally. But don't exclude the non-techies in your organization.
"Successful big data projects bring together business people and technologists perhaps more than any other type of projects we're seeing. It's simply required," said Gentile.
Domain knowledge comes from business people, often analysts who are in charge of knowing what the data is used for, and how quickly it must be put to work, in order to solve the business problem at hand.
Said Gentile, "Domain knowledge is really important, otherwise you'll make a lot of mistakes, or even fail in your big data project by getting it wrong."
3. Apply the right reporting and analysis tools. It's essential to match tools and their capabilities with the business issues that you want to solve.
"A reporting and analysis tool for big data should have three primary characteristics," said Gentile. First, it must provide native, intelligent access to all necessary data sources, including big and traditional data. "We have to assume nowadays that you're going to want to combine and co-mingle the analysis of non-traditional big data types with traditional, relational data types."
Second, an organization must deploy a scalable, modern architecture that can grow as its data volume expands.
"As the volumes of data grow geometrically, sometimes the front-end reporting and analysis tools will break if they don't have a modern architecture that allows them to scale out and take advantage of additional resources affordably," said Gentile.
Third, the tools you use must be adept at handling the latency, or velocity, of your big data project.
If your data is arriving in large volumes in real time--in seconds or sub-seconds--perhaps from machine sensors or a high-volume website, you need to process it very quickly in order to make use of it.
It's essential that you understand the business problem you're trying to solve, and how quickly you need to use the data in order to solve it. "Without understanding that, it's just luck if you get the project right," said Gentile.