Cloudera Intros Hadoop Monitoring, Self-Provisioning Tool

Cloudera Director tool leads off lineup of announcements, including upgrades to Impala SQL query engine and stronger security.

Ellis Booker, Technology Journalist

October 14, 2014

Cloudera made its announcements on the eve of Strata+Hadoop World, where attendance will double compared to the 2013 conference, shown here. (Source: O'Reilly Media)

Leading Apache Hadoop provider Cloudera made a slew of announcements Tuesday, starting with the release of a new, free tool for monitoring and self-service provisioning of Hadoop clusters in the cloud.

Called Cloudera Director, the tool allows business users to provision and monitor private or public cloud deployments of Hadoop, reportedly without needing IT staff intervention. If it works as advertised, Director will make scalable Hadoop cloud deployments far easier to spin up. The application also includes usage tracking for determining costs and departmental charge-backs.
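To make the charge-back idea concrete, here is a minimal sketch of how tracked usage might be rolled up into per-department costs. It is an illustration only, not Director's actual method or API: the record layout (department, instance hours), the flat hourly rate, and the function name `chargebacks` are all assumptions.

```python
from collections import defaultdict


def chargebacks(usage_records, hourly_rate):
    """Sum cost per department from (department, instance_hours) records.

    Hypothetical illustration: assumes one flat hourly rate for every instance.
    """
    totals = defaultdict(float)
    for department, instance_hours in usage_records:
        totals[department] += instance_hours * hourly_rate
    return dict(totals)


# Example usage with made-up numbers.
records = [("marketing", 10), ("finance", 4), ("marketing", 6)]
costs = chargebacks(records, hourly_rate=0.5)
print(costs)
```

A real deployment would likely price different instance types separately, but the roll-up step would look much the same.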

Amazon Web Services (AWS) is the first cloud vendor to be certified with Cloudera Director, followed by SoftLayer, CenturyLink, and T-Systems; other cloud providers are in the works.

The new tool represents "over a year of talking to customers" about their needs for data integration, governance, and security in the cloud, Matt Brandwein, Cloudera's director of product marketing, told InformationWeek in a phone call.

While most Cloudera customers are using on-premises Hadoop today, the company has seen a lot of experimentation in the cloud. "We want to get ahead of that... We want to have their favorite Hadoop experience waiting for them in the cloud," Brandwein said.

While Cloudera Director is free to download and use, the company expects production deployments of cloud Hadoop (either entirely in the cloud or a hybrid of on-premises and cloud) will step up to a paid Cloudera subscription, which includes an unlimited number of Director seats.

Impala 2.0
Since its release 18 months ago, Cloudera's Impala, an SQL query engine for Hadoop, has been downloaded more than a million times. The 2.0 release beefs up support for core SQL functions, vendor-specific SQL extensions, and legacy data types.

Also new in Impala 2.0 is the removal of query-size limits. The engine can now use physical disk when processing queries, so query size is no longer bounded by available RAM, as it was in the previous version.
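The general technique behind lifting a RAM limit is usually called spilling to disk. The sketch below shows the idea for a simple group-and-count query, and it is not Impala's implementation: when the in-memory table grows past a budget, the partial result is written out as a sorted run, and the runs are merged at the end. The function names and the `max_keys_in_memory` parameter are hypothetical.

```python
import heapq
import itertools
import os
import pickle
import tempfile
from collections import Counter


def _spill(counts, tmpdir):
    """Write one sorted run of (key, count) pairs to a temp file."""
    fd, path = tempfile.mkstemp(dir=tmpdir, suffix=".run")
    with os.fdopen(fd, "wb") as f:
        for item in sorted(counts.items()):
            pickle.dump(item, f)
    return path


def _read_run(path):
    """Stream (key, count) pairs back from a spilled run."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return


def count_by_key(rows, max_keys_in_memory=1000):
    """Yield (key, count) pairs, spilling partial counts to disk as needed."""
    with tempfile.TemporaryDirectory() as tmpdir:
        counts, runs = Counter(), []
        for key in rows:
            counts[key] += 1
            if len(counts) >= max_keys_in_memory:
                # Memory budget reached: flush the partial result to disk.
                runs.append(_spill(counts, tmpdir))
                counts = Counter()
        if counts:
            runs.append(_spill(counts, tmpdir))
        # Merge the sorted runs and add up partial counts for each key.
        merged = heapq.merge(*(_read_run(p) for p in runs))
        for key, group in itertools.groupby(merged, key=lambda kv: kv[0]):
            yield key, sum(c for _, c in group)


# Example usage: a budget of two keys forces several spills.
print(dict(count_by_key(["a", "b", "a", "c", "b", "a"], max_keys_in_memory=2)))
```

The trade-off is the usual one: a query that would previously have failed for lack of memory can now finish, at the cost of slower disk I/O.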

Security improvements
Rounding out the news, Cloudera said it had significantly beefed up the security attributes of Cloudera 5.2. While healthcare, financial services, and government organizations are showing increasing interest in Hadoop databases, they often can't move on to production deployments due to a lack of industry-standard security features.

"Security has been a major area of investment for well over a year now, and we've had the opportunity to speak with hundreds of customers," Brandwein said, adding that without these security features, Hadoop deployments will forever be relegated to a side role as cut-off "sandboxes."

To address these shortcomings, Cloudera 5.2 brings the following:

  • Perimeter security based on Kerberos network authentication

  • Role-based access controls for both SQL and non-SQL resources

  • Hardware-level security for Intel chips (leveraging the chipmaker's $740 million, 18% investment in Cloudera in March)

  • Support for auditing and data-lineage tracking

  • Comprehensive encryption, including centralized management of encryption keys (leveraging Cloudera's acquisition of Gazzang in June)

Cloudera made its announcements a day ahead of Strata+Hadoop World's opening in New York. The three-day conference, co-presented by O'Reilly Media and Cloudera, is sold out, with 5,000 attendees expected, nearly double the size of last year's New York show, according to Cloudera.

About the Author(s)

Ellis Booker

Technology Journalist

Ellis Booker has held senior editorial posts at a number of A-list IT publications, including UBM's InternetWeek, Mecklermedia's Web Week, and IDG's Computerworld. At Computerworld, he led Internet and electronic commerce coverage in the early days of the web and was responsible for creating its weekly Internet Page. Most recently, he was editor-in-chief of Crain Communication Inc.’s BtoB, the only magazine devoted to covering the intersection of business strategy and business marketing. He ran BtoB, as well as its sister title Media Business, for a decade. He is based in Evanston, Ill.
