Cloudera Boosts Hadoop Portfolio With Security, Data Update Offerings - InformationWeek

Cloudera is filling the gaps in its Hadoop portfolio with two new products. RecordService provides security management across multiple Hadoop data access apps, while Kudu combines fast analytics with data updates and slims workloads.

No longer just a place to keep big data, Hadoop is growing into a dynamic platform. Now, Cloudera is looking to keep it growing by providing two new pieces that address security and data updates.

On Monday, Sept. 28, Cloudera unveiled RecordService, which provides a single point of security management across multiple Hadoop data access apps. In addition, the company detailed a second product called Kudu, which combines fast analytics and data updates.

Kudu and RecordService are currently in beta. They are being offered for free as open source apps, and are to be donated to the Apache Software Foundation eventually.

Kudu is a high-speed storage engine that bridges HBase (an open source, non-relational database) and HDFS (Hadoop Distributed File System). "Kudu is the culmination of a three-year R&D effort," said Matt Brandwein, director of product marketing at Cloudera.

(Image: Danil Melekhin/iStockphoto)

Without Kudu, HBase and HDFS are hobbled by limitations. HDFS cannot change data once it is written, though it can append data to files. Updating means deleting and re-adding the files, Brandwein said. HBase is designed for rapid updating, but "it's not good for analytics."
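
The trade-off Brandwein describes can be sketched with a toy model. This is only an illustration in Python, not HDFS or HBase code; the names AppendOnlyFile, update_by_rewrite, and KeyedStore are hypothetical and exist only for this example.

```python
class AppendOnlyFile:
    """Toy stand-in for a write-once file: records can be appended, never changed."""

    def __init__(self, records=()):
        self._records = list(records)

    def append(self, record):
        self._records.append(record)

    def scan(self):
        # Return a copy so callers cannot modify stored records in place.
        return list(self._records)


def update_by_rewrite(old_file, key, new_value):
    """'Update' one record by writing a whole replacement file."""
    new_file = AppendOnlyFile()
    for k, v in old_file.scan():
        new_file.append((k, new_value if k == key else v))
    return new_file  # the caller would then drop old_file


class KeyedStore:
    """Toy stand-in for a key-value store: cheap in-place updates by key."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = value

    def get(self, key):
        return self._rows[key]


# Changing one value in the append-only model rewrites every record.
f = AppendOnlyFile([("a", 1), ("b", 2)])
f2 = update_by_rewrite(f, "b", 20)

# The keyed model overwrites a single entry.
store = KeyedStore()
store.put("b", 2)
store.put("b", 20)
```

In this sketch, the rewrite touches every record to change one value, while the keyed store changes only the entry it needs. The toy says nothing about scan speed, which is where a real file-based layout has the advantage.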

Kudu "enables the combination of updating and analytics," he said. It also simplifies Hadoop architecture by reducing two workloads down to one, while still keeping the strengths of HDFS (storage) and HBase (building online applications). Bridging these two will permit the construction of a real-time online dashboard.

RecordService provides consistent security management across different data access apps, like Spark, Hive, and Impala. The challenge is that each has its own set of security guarantees when used without RecordService. Impala and Hive require control of "fine-grained data," while Spark gets by on coarser data security over rows and columns, Brandwein explained.

To solve this challenge, RecordService "sits between storage in Hadoop and accesses all engines in Hadoop." It brokers data requests, looking up permissions in Apache Sentry and presenting only the data the user is allowed to see. "In effect, it brings universal access control and enforcement to the system."

As a result, there are no loopholes a person could exploit by switching from one form of search to another. Each must follow the same pathway, passing through RecordService's filter.
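
The brokering idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not RecordService's actual API: POLICY is a hypothetical stand-in for a permissions service such as Apache Sentry, and broker_read, engine_a, and engine_b are invented names.

```python
# Hypothetical policy table; its structure is an assumption for illustration.
POLICY = {
    "analyst": {
        "columns": {"region", "sales"},
        "row_filter": lambda row: row["region"] == "EU",
    },
    "admin": {
        "columns": {"region", "sales", "customer"},
        "row_filter": lambda row: True,
    },
}

# Toy table standing in for data held in storage.
TABLE = [
    {"region": "EU", "sales": 100, "customer": "Acme"},
    {"region": "US", "sales": 250, "customer": "Globex"},
]


def broker_read(user, table=TABLE, policy=POLICY):
    """Return only the rows and columns the user's policy allows."""
    rule = policy.get(user)
    if rule is None:
        raise PermissionError(f"no access policy for {user!r}")
    return [
        {col: row[col] for col in row if col in rule["columns"]}
        for row in table
        if rule["row_filter"](row)
    ]


# Two hypothetical "engines" that both read through the same broker,
# so neither can see data the policy hides.
def engine_a(user):
    return broker_read(user)


def engine_b(user):
    return [row["sales"] for row in broker_read(user)]
```

Because both engines call the same broker, an analyst gets the same filtered view whichever engine is used, which is the property the article attributes to RecordService.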

Hadoop customers want to store and analyze data on one platform, and use one architecture instead of different architectures on different servers. Completing that singular platform is the challenge. "The pieces are there," Brandwein said. It is more a question of Hadoop reaching maturity, where those pieces are all in their proper place, working together.

"Hadoop is rapidly completing. I don't think we are there yet," he said. "The vision is not Hadoop being another database. We are reinventing how analytics are done."

Hadoop began life as a way to store and process big data.

Its most common use is ETL (extract, transform, and load), according to a recent study done by AtScale. Now the goal is to provide "an end-to-end analysis chain," collecting data in one place and working with it in multiple ways, Brandwein said.

William Terdoslavich is an experienced writer with a working understanding of business, information technology, airlines, politics, government, and history, having worked at Mobile Computing & Communications, Computer Reseller News, Tour and Travel News, and Computer Systems ...
