10/2/2009
09:03 PM

Cloudera Launches Desktop Interface

Cloudera is trying to make Hadoop, used by Google, Amazon and Yahoo for data analysis, useful as a business intelligence tool.

Cloudera Friday launched a user interface, Cloudera Desktop, to make it easier to implement and use Hadoop applications in the enterprise as a business intelligence tool.

Hadoop is open source software built for cloud-style computing on large server clusters, and it's already in heavy use at Yahoo and Amazon.com. Hadoop provides a method of storing a large data set across many disks, then reading it back from all of them in parallel so that a single pass can analyze the key information.

Its power lies in the fact that it can do so much faster than standard database operations allow. On a terabyte of data, Hadoop can produce results in seconds or minutes, as opposed to the many hours of processing that might occur inside a data warehouse.

Hadoop includes a cloud-oriented function, MapReduce, which maps data across a cluster to be analyzed by processors close to where the data is located. This function is part of what reduces the time it takes to process a large data set.
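The map-then-reduce idea described above can be illustrated with a minimal, single-process Python sketch. It is not Hadoop's API: the function names (map_phase, shuffle, reduce_phase, run_job), the word-count task and the sample records are all assumptions made for illustration, and each inner list merely stands in for a block of data held on one node of a cluster.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group all emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the values collected for one key into a single total."""
    return key, sum(values)


def run_job(partitions):
    """Run map over every partition, then shuffle and reduce the results.

    On a real cluster each partition would be mapped on the machine that
    stores it; here the partitions are simply processed one after another.
    """
    mapped = []
    for partition in partitions:
        for record in partition:
            mapped.extend(map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


# Hypothetical sample data: two partitions, as if stored on two nodes.
partitions = [
    ["hadoop stores data", "data on many disks"],
    ["hadoop reads data in parallel"],
]
counts = run_job(partitions)
print(counts["data"], counts["hadoop"])
```

Because each map call looks only at its own record, the work can be spread across many machines and run at the same time, which is the property the article credits for Hadoop's speed.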

Hadoop use thus far has been the province of heavy-hitting computer science PhDs at major Internet companies, such as Google, as they analyze the content of the Internet. Cloudera is trying to make Hadoop's analysis powers available to the average business analyst.

"We have built Cloudera Desktop to ease Hadoop adoption outside its birthplace," explained Mike Olson, Cloudera CEO, in an interview.

Olson is the former CEO of Sleepycat, supplier of the open source BerkeleyDB embeddable database, now owned by Oracle; he served there as VP of embedded databases for two years after the acquisition. Cloudera was founded to provide technical support to Hadoop users and to increase Hadoop's use.

"Hadoop is a flexible data storage platform. You can do flexible analysis with it," said Jeff Hammerbacher, VP of products, in an interview. He is the former head of the data team at Facebook, which used the massive amounts of statistics generated on the Facebook site to analyze what users did with the site and what features to produce next.

The Cloudera Desktop aids the task of putting Hadoop to work by supplying four applications. The Desktop's File Browser enables copying and browsing large data files stored on a cluster. Its job submission app, Job Designer, can be used to define a Hadoop job, run it and save it for future reuse. The Job Browser app lets a Hadoop user track the progress of an analysis job. And the Cluster Health dashboard tells the Hadoop user whether all is well with the machine cluster on which Hadoop is running; it can alert system administrators if the cluster is running into a problem.

Roughly equivalent functionality can be obtained through the use of the Apache open source code. Cloudera has moved that functionality from a command line interface to an easier-to-adopt graphical user interface. "We expect it to drive new use of Hadoop," said Hammerbacher.

All four applications run in a user's Web browser, and can run on Windows, Linux or Apple Macintosh machines.

Cloudera received $6 million in a second round of funding from Greylock Partners after receiving $5 million in first round funding. Its individual angel funders include Diane Greene, former CEO of VMware, and Marten Mickos, former CEO of MySQL AB.


