Splice Machine promises SQL- and ACID-compliant RDBMS for analytics and transaction processing on a super-scalable, low-cost Hadoop platform.
a NoSQL database was out of the question. At the same time, running Oracle RAC for a series of 10-terabyte to 20-terabyte instances for each Harte Hanks customer was getting to be expensive.
"We have a lot of investment in things that run on SQL, including Cognos, Unica, ETL work, data cleansing, customer roll-ups and models, and staff that are doing analytics with SAS and SPSS," said Robert Fuller, managing director of product innovation at Harte Hanks, in a phone interview with InformationWeek. "With Splice Machine we can still work with all that, but we're getting the benefits of Hadoop scaling and performance as well as lower-cost hardware and lower-cost software."
By way of comparison, Fuller said adding six nodes to a Hadoop cluster requires $25,000 worth of hardware, whereas adding equivalent capacity with Oracle RAC and a separate storage area network would cost more than $100,000 just for the hardware. Add the software licenses, and "you're not doubling or tripling the cost, you're ten times the cost."
Splice Machine last summer demonstrated in its own labs that it could run the Harte Hanks applications and beat Oracle RAC performance. By the end of last year, Harte Hanks built out a Cloudera cluster and proved that it could replicate that performance using customer data in its own datacenters.
"One of the common campaign-performance queries that we've tested takes about 183 seconds in our production Oracle RAC deployment, and it's taking less than 20 seconds on Splice Machine on a nine-node Cloudera cluster," says Fuller.
The next step for Harte Hanks is to build out replication and high-availability features and take Splice Machine into production. Fuller has not had to hire new staff to learn how to deploy and use Hadoop thus far, but that may change, he says, when Harte Hanks starts taking advantage of MapReduce processing, as well as SQL OLTP and analysis on top of Hadoop.
The next step for Splice Machine hinges in part on the pending 1.0 release of HBase, says Zweben, noting that this foundation of the Hadoop ecosystem is still at the 0.95 release stage. Splice Machine 1.0 will be generally available sometime this year, he vows, but he notes that the Splice Machine public beta release now available for download is suitable for production deployment.
"HBase powers RocketFuel, a company that handles on the order of 15 petabytes of advertising optimization data a day," says Zweben, who is a member of RocketFuel's board of directors. "Our beta system is ready to be put into operation today."
Splice Machine's apparent success in doing it all on Hadoop makes one wonder if the commercial database incumbents can and will follow suit.
Doug Henschen is Executive Editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data and analytics. He previously served as editor in chief of Intelligent Enterprise, editor in chief of ...