Splice Machine SQL Database Scales On Hadoop - InformationWeek

InformationWeek is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them.Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

Data Management // Big Data Analytics
02:55 PM
Connect Directly

Splice Machine SQL Database Scales On Hadoop

Splice Machine promises SQL- and ACID-compliant RDBMS for analytics and transaction processing on a super-scalable, low-cost Hadoop platform.

10 Hadoop Hardware Leaders
10 Hadoop Hardware Leaders
(Click image for larger view and slideshow.)

Promising the best of both worlds, startup Splice Machine last week announced the latest stab at putting SQL on Hadoop, but this time it's a fully SQL-compliant and ACID-compliant relational database management system (RDBMS) on Hadoop that's not just for analytics.

"Splice Machine can replace Oracle, Microsoft SQL Server, IBM DB2, or MySQL, where those systems might hit the wall from a performance or cost perspective," said Monte Zweben, CEO of Splice Machine, in a phone interview with InformationWeek.

Hadoop provides the scale-out technology for Splice Machine, so it runs on scalable, commodity clusters. At the same time it's compatible with existing investments in SQL-based business intelligence software, ETL systems, and applications through an ODBC/JDBC driver.

[Want more on Hadoop options? Read Pivotal Subscription Points To Real Value In Big Data.]

Several databases have been ported to run on top of Hadoop, including Pivotal's Greenplum database (through HAWQ) and InfiniDB, but these are specialized databases designed for high-scale querying and analysis. Splice Machine, which marries the open-source Apache Derby Java-based database with Hadoop's HBase NoSQL database, touts RDBMS-speed transaction processing.

"Our unique differentiation is that we're the only [SQL-on-Hadoop option] that can support concurrent reads and writes in a transactional context with ACID compliance," says Zweben.

Splice Machine uses a concurrency control method called "snapshot isolation" in combination with HBase, which has ACID properties over updates in a single table. The Apache Derby SQL planner and optimizer have been extended to take advantage of Hadoop's parallel architecture, according to Zweben. As plans are executed on each node, they're spliced back together -- thus the name of the company.

"We started with two well established open-source stacks, Derby and Hadoop, and that's one of the reasons we can come to market so quickly," says Zweben. Splice Machine was founded in 2012.

With last week's introduction, Splice Machine entered public beta, but the company says it has 15 charter customers in industries including digital marketing, telecom, and high-tech. One of those customers is well known marketing services firm Harte Hanks, which has been testing Splice Machine since last summer.

Harte Hanks is poised to replace Oracle RAC in a campaign-management application that combines IBM Unica, IBM Cognos reporting, Ab Initio data-integration software, and Trillium data-cleansing technologies. All of the above are designed to run on or work with SQL RDBMSs, so moving the app onto Hadoop or

Next Page

Doug Henschen is Executive Editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data and analytics. He previously served as editor in chief of Intelligent Enterprise, editor in chief of ... View Full Bio

We welcome your comments on this topic on our social media channels, or [contact us directly] with questions about the site.
1 of 2
Comment  | 
Print  | 
More Insights
Newest First  |  Oldest First  |  Threaded View
D. Henschen
D. Henschen,
User Rank: Author
5/19/2014 | 4:49:47 PM
Cloudera or MapR: Harte Hanks Considers its options
It's interesting to note that Harte Hanks -- the Splice Machine customer interviewed for this article -- hasn't yet settled on Cloudera for the long term. Harte Hanks needs dedicated database instances of Splice Machine for each customer, so it needs separate instances. On Cloudera it has to be separate physical instances, but on MapR it could be virtual instances.

"Cloudera uses more of the open source stack and fewer proprietary pieces than MapR," said Harte Hank's Robert Fuller, explaining his initial choice of Cloudera. "MapR now promises to support all of the open source pieces of Hadoop but at the same time their proprietary piece offer substantial benefits."

MapR has invested heavily in multi-tenancy, for example, Fuller explained. In Cloudera or Hortonworks, you have to tune the cluster to your applications, but you have to do it cluster-wide. "MapR has done a lot of work to make those settings shardable within the cluster, so you can make certain servers run in one configuration and others in a different configuration," Fuller said.

Harte Hanks is only talking to MapR at this point, and it would have to prove that a Splice Machine deployment running on MapR could run all of Harte Hank's software and give it virtual cloud deployment flexibility.
InformationWeek Is Getting an Upgrade!

Find out more about our plans to improve the look, functionality, and performance of the InformationWeek site in the coming months.

Becoming a Self-Taught Cybersecurity Pro
Jessica Davis, Senior Editor, Enterprise Apps,  6/9/2021
Ancestry's DevOps Strategy to Control Its CI/CD Pipeline
Joao-Pierre S. Ruth, Senior Writer,  6/4/2021
IT Leadership: 10 Ways to Unleash Enterprise Innovation
Lisa Morgan, Freelance Writer,  6/8/2021
White Papers
Register for InformationWeek Newsletters
Current Issue
Planning Your Digital Transformation Roadmap
Download this report to learn about the latest technologies and best practices or ensuring a successful transition from outdated business transformation tactics.
Flash Poll