Big Data: Informatica Tackles The High-Velocity Problem

The software upgrade addresses data volume, velocity, and variety, and also adds connectivity to social network feeds from Facebook, Twitter and LinkedIn, and read-write access to the Hadoop Distributed File System.
Big data is not just a matter of scale; it's also about data variety and data velocity. Informatica announced Monday that the 9.1 release of its data-integration platform addresses this combination of challenges, something the company claims is unique.

It might seem late to be jumping on the big-data bandwagon. Indeed, there's no single element of the Informatica 9.1 release that is surprising or entirely new. Nonetheless, it's the sweep of data-management and data-integration capabilities that will appeal to architects, analysts, and enterprises interested in a comprehensive platform and management approach.

To address big-data scale, Informatica 9.1 offers near-universal connectivity to transactional and analytical databases, according to the company, and that list includes the high-scale analytical appliances offered by the likes of EMC Greenplum, HP Vertica, IBM Netezza, and Teradata. These links were there in Informatica 9.0, and you'd expect them in any integration suite worth its salt. But Informatica's independence and focus on integration help it stay current with myriad data sources.

Informatica addresses high-velocity sources by way of its 29 West (ultra high-speed messaging) and Agent Logic (complex event processing) acquisitions. The related technologies address messaging and monitoring workloads that generate multiple terabytes of data per day, with requirements for sub-second analysis. Here, too, support for these systems and similar third-party technologies was already there in the 9.0 release.

What's entirely new in 9.1 is connectivity to social network feeds from Facebook, Twitter and LinkedIn, and read-write access to the Hadoop Distributed File System (HDFS). These "big interaction" and "big data processing" sources, as Informatica describes them, are examples of the diverse data types emerging as part of the big-data story.

By crossing analyses of social network data with transactional information, for example, companies are developing a better understanding of customers and their satisfaction with products and services. And by turning to low-cost processing and storage platforms like Hadoop, companies are cost-effectively handling large-scale transactional information and new data types, including log files and other machine-generated data.

There's nothing new about tapping into social network feeds--sentiment analysis applications have been doing that for at least a couple of years. And data warehouse appliances invariably have connectors for Hadoop. Informatica's advantage is in offering a single, all-points integration environment and approach rather than forcing developers to tackle a series of point-to-point integrations, each with its own technical challenges.

"Point-to-point connectivity is easy, but if you're trying to manage integration as an enterprise, you'd want one way to standardize," Informatica chief technology officer James Markarian told InformationWeek.

Among Informatica's more than 4,300 customer firms, about 10% are growing into the big-data realm exceeding 100 terabytes, Markarian estimated. Financial services companies are leading the way, he said, but healthcare, logistics, manufacturing, data services bureaus, and even casinos are scaling up.

Beyond the big-data improvements in the 9.1 release, Informatica has upgraded and extended self-service and Web data-services capabilities that started in release 9.0. Self-service is about empowering users to handle data-management tasks themselves, without coding or IT help.

Where 9.0 introduced self-service options for handling data-quality and data-stewardship tasks, 9.1 adds tools and interfaces for data movement. Thus, users can now search, profile, filter, and apply data-cleansing and data-quality rules to data before moving it wherever it's needed, according to Informatica.

"Customers tell us self-service capabilities have doubled the productivity of their data analysts," Markarian said.

Informatica says it has completed the Web-services enablement introduced in the 9.0 release. Where that release brokered data via SQL only, the 9.1 release brokers data through either SQL or Web services, for both reading and writing. This gives developers and data analysts more options for blending myriad data sets and sources, and delivering that information wherever it's needed.

Considering the big-data and Web-services capabilities together, you could provide SQL access on top of Hadoop while also using data services to join Hadoop data with mainframe data, Oracle data, messaging data or anything else that Informatica touches.
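
To make the idea concrete, here is a minimal sketch of what "SQL on top of joined sources" means in practice. It does not use Informatica's product or any Hadoop API; Python's built-in sqlite3 module stands in for the data-services layer, and the table names, column names, and sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical stand-ins for two sources: rows parsed from log files kept in
# Hadoop, and customer records from an operational database.
hadoop_log_rows = [("c1", "/checkout", 3), ("c2", "/home", 1), ("c1", "/home", 2)]
customer_rows = [("c1", "Acme Corp"), ("c2", "Globex")]


def join_sources(log_rows, cust_rows):
    """Load both sources into one SQL engine and join them on customer_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE web_logs (customer_id TEXT, page TEXT, hits INTEGER)")
    conn.execute("CREATE TABLE customers (customer_id TEXT, name TEXT)")
    conn.executemany("INSERT INTO web_logs VALUES (?, ?, ?)", log_rows)
    conn.executemany("INSERT INTO customers VALUES (?, ?)", cust_rows)
    # One query spans both sources once they sit behind a common SQL interface.
    result = conn.execute(
        "SELECT c.name, SUM(w.hits) "
        "FROM customers c JOIN web_logs w ON w.customer_id = c.customer_id "
        "GROUP BY c.name ORDER BY c.name"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for name, total_hits in join_sources(hadoop_log_rows, customer_rows):
        print(name, total_hits)
```

In a real deployment the two tables would be views over remote systems rather than local copies; the sketch only illustrates why a single query interface across sources is attractive.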

"You can't get that from Hadoop or any other data-services technology that lacks the reach of our platform," Markarian said.

There have been lots of big-data-related data-integration and data-management announcements in recent months, but many are one-trick ponies--a connector to Hadoop, a plug-in data sorter, a parallel-processing engine. And in several cases the products have been announced but won't be available until later this year.

Informatica 9.1 is set to ship by the end of this month, and the upgrade handles big data in a broad integration and data-management context.
