ParAccel Lowers the Cost of High-Performance BI - InformationWeek

Commentary
12/4/2007
10:01 AM
mmadsen

ParAccel announced top TPC-H benchmark numbers with Sun at the end of October, beating out the former leaders in both performance and price-performance. Not by a little, but by four times in performance with a big drop in cost. I haven't seen much discussion of these results.

The fact that a little startup like ParAccel can enter the market with a database to support business intelligence that beats the TPC-H results of all the major vendors on both performance and price should wake people up, particularly when the performance increase is so large and comes with a significantly lower cost.

What's different about ParAccel's database? They're using a columnar data store rather than the row-oriented data storage model most vendors use. This results in significant I/O reductions and allows for more effective use of compression. Their model is similar to (but not the same as) Sybase IQ. The TPC-H numbers demonstrate the difference pretty clearly.
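To make the I/O argument concrete, here is a minimal Python sketch. It is a toy illustration, not ParAccel's storage format: the table, the column names, and the text encoding are all invented for the example. It stores the same data row-wise and column-wise, then compares how many bytes a query that only needs one column has to read, and how well a low-cardinality column compresses on its own.

```python
import zlib

# Hypothetical toy table: (order_id, region, amount). All names are invented.
REGIONS = ["EAST", "WEST", "NORTH", "SOUTH"]
rows = [(i, REGIONS[i % 4], (i * 37) % 1000) for i in range(10_000)]


def encode(values):
    """Serialize values as newline-separated text bytes."""
    return "\n".join(str(v) for v in values).encode("utf-8")


# Row layout: every field of a row is stored together.
row_store = encode(",".join(str(field) for field in row) for row in rows)

# Column layout: each column is stored as its own contiguous block.
col_store = {
    "order_id": encode(row[0] for row in rows),
    "region": encode(row[1] for row in rows),
    "amount": encode(row[2] for row in rows),
}


def sum_amount_from_rows(store):
    """SUM(amount) on the row layout: every byte of every row is read."""
    total = 0
    for line in store.decode("utf-8").split("\n"):
        total += int(line.split(",")[2])
    return total


def sum_amount_from_column(store):
    """SUM(amount) on the column layout: only the 'amount' block is read."""
    return sum(int(v) for v in store["amount"].decode("utf-8").split("\n"))


row_bytes_read = len(row_store)
col_bytes_read = len(col_store["amount"])
region_raw = len(col_store["region"])
region_compressed = len(zlib.compress(col_store["region"]))

print("bytes read, row layout:   ", row_bytes_read)
print("bytes read, column layout:", col_bytes_read)
print("region column raw / compressed:", region_raw, "/", region_compressed)
```

Both functions return the same sum, but the columnar version touches only the bytes of the one column the query asks for. A real engine works with disk blocks and type-aware encodings rather than text, so the sketch shows the direction of the effect, not its size.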

It's surprising that there aren't more columnar storage engines out there, particularly since the storage layout is not visible to users yet offers such a significant performance advantage for query workloads. People always ask about Vertica whenever I mention "columnar database." Vertica has been floundering around for quite a while now with very little to show in the way of accomplishments.

The other interesting element in ParAccel is that they run in a shared-nothing configuration, which most in the VLDB arena agree is the only way to scale to very large data volumes. It also makes scaling more cost-effective, which is why appliance vendors like Netezza, new database vendors like Greenplum and the old guard at Teradata are all running shared-nothing architectures (these are all row-oriented data stores).
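The shared-nothing idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how ParAccel or any of the vendors above implement it: the node count, the modulo partitioning on a hypothetical `order_id` key, and the function names are all invented. Each "node" owns a disjoint slice of the rows, aggregates only its own slice, and a coordinator merges the small partial results.

```python
from collections import Counter

NUM_NODES = 4  # hypothetical cluster size

# Hypothetical toy table: (order_id, region, amount).
REGIONS = ["EAST", "WEST", "NORTH"]
rows = [(i, REGIONS[i % 3], i % 50) for i in range(1000)]


def partition(all_rows, num_nodes):
    """Assign each row to exactly one node by its key; nodes share no data."""
    nodes = [[] for _ in range(num_nodes)]
    for order_id, region, amount in all_rows:
        nodes[order_id % num_nodes].append((order_id, region, amount))
    return nodes


def local_aggregate(node_rows):
    """SUM(amount) GROUP BY region, computed from one node's local rows only."""
    totals = Counter()
    for _, region, amount in node_rows:
        totals[region] += amount
    return totals


def merge(partials):
    """Coordinator step: add up the per-node partial sums."""
    result = Counter()
    for partial in partials:
        result.update(partial)  # Counter.update adds values per key
    return result


nodes = partition(rows, NUM_NODES)
merged = merge(local_aggregate(node_rows) for node_rows in nodes)
print(dict(merged))
```

Because no node needs another node's disk or memory to do its share, adding nodes adds scan capacity without adding contention on a shared resource. In this sketch only the small per-region totals cross the "network."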

ParAccel is offering three modes of deployment - a straight database, a virtual appliance, and a preconfigured hardware appliance (though I don't have a feel for how "appliancey" their offering is).

Appliances are helping to overcome the resistance to alternative databases. Demand for BI performance is exceeding the ability of traditional platforms to keep pace. The problem with one standard database for both transaction and analysis workloads is the constantly rising data volumes and users repeating the mantra of "faster queries."

Enterprise IT has been trying to consolidate database vendors for years, but data warehousing workloads add complexity to the traditional database model. Over time, the database connection and SQL standards have improved, along with database manageability, to make having multiple databases less of a concern in IT.

We already understand that different schema designs are required. The special requirements that BI and analytics bring to the database are leading people to the realization that different database platforms can make sense.

The TPC-H announcement by ParAccel shows that a different database is viable, just as Netezza and DATAllegro showed in the appliance space.

Personally, I'm not a big fan of TPC benchmarks because they don't relate well to real-world performance or configurations. However, they are useful for determining what databases or hardware-database combinations are in roughly the same class, and how realistic it is to run them at that configuration.

One big problem is the optimization for one of the two TPC-H metrics (performance or price-performance). Vendors run different configurations for these two numbers to get the best metric, which means the configuration that's best in performance may be completely unreasonable from a price perspective. For example, few people are going to blow $11 million on hardware for a 3 TB data warehouse configuration.
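The arithmetic behind that trade-off is simple. TPC-H reports a performance metric (QphH at a given scale factor) and a price-performance metric, which is the total system price divided by QphH. The sketch below uses made-up numbers: only the $11 million price echoes the example above, while the QphH values and the second configuration are hypothetical and do not come from any published result.

```python
def price_performance(total_system_price_usd, qphh):
    """Dollars per QphH: total system price divided by the performance metric."""
    if qphh <= 0:
        raise ValueError("qphh must be positive")
    return total_system_price_usd / qphh


# Hypothetical configurations; every figure here is illustrative only.
configs = {
    "tuned_for_performance": {"price_usd": 11_000_000, "qphh": 100_000},
    "tuned_for_price_performance": {"price_usd": 500_000, "qphh": 20_000},
}

for name, cfg in configs.items():
    ratio = price_performance(cfg["price_usd"], cfg["qphh"])
    print(f"{name}: {cfg['qphh']} QphH at ${ratio:.2f} per QphH")
```

In this invented case the expensive system is five times faster but costs more than four times as much per unit of throughput, which is why a vendor can top one list with a configuration that would look unreasonable on the other.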

ParAccel and Sun's benchmarks were run at the 100GB, 300GB and 1TB scale factors and hold the top slot in all three. They did this with the same hardware configuration. That's unusual in the TPC-H, as is holding the top slots for both performance and price-performance.

I haven't heard from them about whether they will run the 3TB, 10TB or 30TB configurations. I suspect price-performance won't be as impressive at those scales because they may need to shift from internal storage to more costly external storage arrays and bump up the cost.

By the way, the Sun-ParAccel benchmark was run on Linux. Go penguins!

Mark Madsen is president of Third Nature, a consulting and research firm focused on business intelligence, data integration and data management. He is a principal author of Clickstream Data Warehousing and speaks about data warehousing and emerging technology. Write him at [email protected].
