How LexisNexis Competes In Hadoop Age - InformationWeek

9/28/2012
01:03 PM

How LexisNexis Competes In Hadoop Age

Open source HPCC platform evolves from turnkey system to Hadoop competitor.

Hadoop is certainly the biggest name in big data platforms, and often the go-to solution for enterprises seeking a way to manage growing volumes of unstructured data. But LexisNexis, best known as a provider of computer-assisted legal research services, wants the world to know it has an alternative, albeit one that relatively few organizations are using.

HPCC (High Performance Computing Cluster) is an open source platform from LexisNexis Risk Solutions, a division of the company that focuses on big data products and services. LexisNexis, itself a subsidiary of global publishing giant Reed Elsevier, uses HPCC technology for its risk management business, and to gather data it sells to its clients.

"Over the last 10 years or so, we've been selling some of these platforms to customers who came asking for them. But we weren't too proactive in pushing them to market," said Flavio Villanustre, VP of infrastructure for LexisNexis Risk Solutions' HPCC Systems. "We thought it was our bread and butter, our core technology, so why sell it?"

During that decade, customers for HPCC turnkey systems included government, intelligence, and law enforcement agencies, as well as financial and risk management firms.

But the sudden emergence of big data led LexisNexis to rethink its strategy.

"Over the last two to three years, we started to see the rise of big data. Before then, it was hard for us to think there was a use for what we had," said Villanustre.


Believing it had the superior platform for managing massive volumes of information, LexisNexis decided a year ago to offer HPCC as open source code. It positioned the platform as a competitor to Hadoop and other big data management systems.

To date, HPCC's industry footprint is still quite small. Villanustre estimates between 50 and 60 organizations use the enterprise edition.

"People can use the open source version, but if they want support, training, or other more advanced modules, they can buy the enterprise license," said Villanustre.

According to LexisNexis, there are several noteworthy differences between HPCC and Hadoop, including HPCC's open-sourced Enterprise Control Language (ECL). For data transformations, ECL's capabilities are similar to those of Pig or Hive. It's a high-level programming language, which in theory means fewer programmers and shorter project-completion times.

HPCC is an integrated system that extends across the entire data lifecycle, including data ingestion, processing, and delivery. It's scalable up to several thousand nodes, and HPCC configurations require fewer nodes to deliver the same processing power as a Hadoop cluster, the company claims. For an in-depth, if partisan, HPCC vs. Hadoop comparison, see this HPCC Systems chart.

HPCC customers today use the platform for a variety of sophisticated, data-intensive applications, including fraud detection and identity verification.

"When it comes to fraud, for example, we have a very good social graph analytics system," Villanustre said. "We can take the social graph of large populations--hundreds of millions of people--and use that information to show (connections) between apparently disconnected potential fraud cases."
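To make the idea of linking "apparently disconnected" cases concrete, here is a minimal sketch in Python. It is not HPCC code, ECL, or LexisNexis's method; the graph, the node names, and the `shortest_link` helper are all hypothetical. It assumes people and shared attributes (an address, a phone number) are nodes, and uses a plain breadth-first search to find the shortest chain between two fraud cases.

```python
from collections import deque


def shortest_link(graph, start, goal):
    """Breadth-first search over an adjacency-list graph.

    Returns the shortest chain of nodes linking start to goal,
    or None if the two are not connected.
    """
    if start == goal:
        return [start]
    parents = {start: None}  # node -> node it was reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in parents:
                continue  # already visited
            parents[neighbor] = node
            if neighbor == goal:
                # Walk back through parents to rebuild the chain.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical toy data: edges represent a shared address or phone number.
graph = {
    "claimant_a": ["shared_address_1"],
    "shared_address_1": ["claimant_a", "person_x"],
    "person_x": ["shared_address_1", "shared_phone_7"],
    "shared_phone_7": ["person_x", "claimant_b"],
    "claimant_b": ["shared_phone_7"],
    "claimant_c": [],
}

print(shortest_link(graph, "claimant_a", "claimant_b"))
print(shortest_link(graph, "claimant_a", "claimant_c"))
```

In this toy graph, the two claimants never appear together, yet a chain through a shared address, an intermediary, and a shared phone number connects them. At the scale Villanustre describes, hundreds of millions of people, the same search would have to run on a distributed platform rather than in a single process.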

The market for big data management platforms is very new. Hadoop may be the best-known solution today, but its shortcomings provide an opportunity for competing platforms.

"It has operational limitations, and you need to resort to a number of extended components to make it work, and to make it reliable," said Villanustre. "We think some of the companies using Hadoop today might go through disillusionment," and perhaps switch to other platforms, including HPCC.

"Hadoop will become more fragmented, pulled across by different commercial players trying to leverage their own solutions," he said.

Villanustre also pointed out that many organizations are finding Hadoop difficult to use. "There's a lack of talent in that area," he added.

