New Tools Simplify Power Of Hadoop
Talend is one of several startups that aim to bring big data analytics to the masses.
One of the most persistent criticisms of Hadoop is that it's too complex for non-techies to use. Despite the software framework's powerful attributes for managing big data operations, it's not for everybody--at least not yet. For instance, business analysts who can't write code need help from a third party--often their company's IT department--to extract the data they need.
A number of software vendors are working to change that. One is Talend, a 6-year-old open source software startup. Another is Pentaho Corporation, whose business analytics software was recently chosen by Dell for its Apache Hadoop Solution. And then there's SnapLogic, which calls its SnapReduce technology "Hadoop for humans," and Tableau Software, maker of the Tableau Desktop drag-and-drop analytics tool.
Their goal: Bring Hadoop's power to the masses.
"Today Hadoop is an extremely powerful platform," said Yves de Montcheuil, VP of marketing at Talend, in a phone interview with InformationWeek. "But deployments of Hadoop require extremely deep technical expertise to write scripts that allow you to get value out of your big data." Talend "democratizes this situation by allowing users to process and transform big data without deep technical expertise," according to de Montcheuil.
In February, the company launched Talend Open Studio for Big Data, a data integration tool that features a graphical user environment. It's bundled with Hortonworks Data Platform and is compatible with Apache Hadoop distributions. Talend Open Studio for Big Data is a component of the Talend Platform for Big Data, which includes a set of graphical components and a workspace for data management. These components automatically generate code for the Hadoop Distributed File System (HDFS), Pig, HBase, Sqoop, and Hive. Users can load and extract information from a big data source without writing code.
"You design your data processing jobs graphically," said de Montcheuil. "The tool takes care of generating the code for you."
He estimates a non-programmer will need to spend "a few hours of training" to become competent with Talend's graphical toolkit. The company provides free online training resources and a user forum. "If you want, you can buy services, training, and tech support from Talend," said de Montcheuil. "But you don't have to. You can work with the community and free resources."
Talend currently has more than 3,500 paying enterprise customers, and about a million users of its free open source version, the company estimates. Its open source business model, unlike those of more established competitors, is designed to reach both small and large organizations.
"On the broad data integration space, there are a number of companies, some of which are very successful, that offer a technology that is extremely expensive," said de Montcheuil. "They offer enterprises great technology, but unless you have a budget that allows you to spend six figures on these tools, you will not be able to afford the products."
The sudden emergence of big data as an industry focus makes it challenging for relatively new players like Talend to forecast the future. "If you had asked me two years ago where I would see our product line going, I would probably not have mentioned big data, because that was just a blip on the radar screen," de Montcheuil said.
"In the short term, we are going to continue to invest a lot into big data. We are not going to stop investing in traditional environments, because a lot of people are still doing very down-to-earth integration projects that involve relational databases," said de Montcheuil, adding, "We will continue to evolve those products. But big data is the main way we are going to see some growth."
"Talend is competent in multiple areas. They're very good in data quality management, master data management, and enterprise integration," 451 Research analyst Carl Lehmann told InformationWeek by phone.