Cloudera Addresses Hadoop Analytics Gap
Cloudera is the #1 provider of Hadoop software, training and commercial support. From this position of strength, Cloudera has sought to advance the manageability, reliability and usability of the platform.
During 2012, the discussion turned from convincing the broad corporate market that Hadoop is a viable platform to convincing people that they can gain value from the masses of data on a cluster. But to do that, we'll need to get past one of Hadoop's biggest flaws: the slow, batch-oriented nature of MapReduce processing. Tackling the problem head-on, Cloudera has introduced Impala, an interactive-speed SQL query engine that runs on existing Hadoop infrastructure. Two years in development and now in beta, Impala promises to make all the data in the Hadoop Distributed File System (HDFS) and in Apache HBase database tables accessible for real-time querying. Unlike Apache Hive, which offers a degree of SQL querying on Hadoop, Impala does not depend on MapReduce processing, so it should be much faster.
There's a lot riding on Impala. What's not yet clear is whether it will mostly work alongside conventional relational tools or cut many of them out of the picture. Either way, all eyes will be on Cloudera in 2013.