13 Big Data Vendors To Watch In 2013
December 11, 2012 09:06 AM
From Amazon to Splunk, here's a look at the big data innovators that are now pushing Hadoop, NoSQL and big data analytics to the next level.
DataStax Plays Cassandra Three Ways
Apache Cassandra is an open-source, column-group style NoSQL database that was developed by Facebook and inspired by Amazon's Dynamo database. DataStax is a software and commercial support provider that can implement Cassandra as a stand-alone database, in conjunction with Hadoop (on the same infrastructure) or with Solr, which offers full-text-search capabilities from Apache Lucene.
The combination of Cassandra and Hadoop on the same cluster is attractive. There are some performance tradeoffs, but Cassandra as implemented by DataStax offers a few scalable and cost-effective options. A big appeal of this NoSQL database is CQL (Cassandra Query Language) and the JDBC driver for CQL, which provide SQL-like querying and ODBC-like data access, respectively. When Cassandra is implemented in combination with Hadoop, you can also use MapReduce, Hive, Pig and Sqoop. Use of Solr is separate from Hadoop, and its capabilities include full-text search, hit highlighting, faceted search and geospatial search.
The two biggest threats to Cassandra, and thus to DataStax, are HBase (now used by Facebook) and DynamoDB, Amazon's cloud-based service based on Dynamo. The bigger threat appears to be HBase, as the entire Hadoop community is working on maturing that Hadoop component into a stable, high-performance, easy-to-manage NoSQL database that's available as part of the same platform. Success on that front will likely take some of the wind out of Cassandra's sails (and out of DataStax's sales). For now, HBase is still perceived as green, while DataStax customers such as Constant Contact, Morningstar and Netflix attest to stability, scalability and performance on Cassandra today.