Microsoft, Sybase and Vertica Raise Data Warehouse Ante - InformationWeek

2/26/2009
02:02 PM
Doug Henschen

Microsoft, Sybase and Vertica Raise Data Warehouse Ante

This week brought not one, not two, but three fairly significant data-warehouse-related product announcements at the TDWI event in Las Vegas. That's a testament to the pace of innovation in data warehousing and to the insatiable demand for better, faster, cheaper ways of crunching more numbers.

The first of this week's announcements came from Microsoft, which released its Fast Track Data Warehouse reference architectures. These preconfigured, SQL Server-ready 4-terabyte to 32-terabyte server-and-storage bundles are akin to Oracle's Optimized Warehouses and IBM's Balanced Configuration Units. But in Microsoft's case they're also billed as a stepping stone to Microsoft's Project Madison release, which will take SQL Server into the hundreds of terabytes with massively parallel processing (MPP) and scale-out architecture.

How can a non-MPP (symmetric multiprocessor) appliance sold today be a stepping stone to an MPP-based system to be offered by next year? "If you want to move to a multi-node architecture, you can do that through what we call a hub-and-spoke architecture," says Herain Oberoi, Group Product Manager, SQL Server. "The hub would be an MPP-based Madison deployment, and that would sync with the individual spokes with high-speed data transfer capabilities."

In other words, the Fast Track Warehouse(s) you build today could later become spokes on a Madison/MPP-based hub that would create the enterprise data warehouse. That doesn't mean, however, that the spokes you build today have to be data marts, says Oberoi: "Some customers will use these as full data warehouses, it's just that they'll tend to have more of a departmental focus. When the time is right, they can add an MPP hub for scalability and extreme processing power."

The second of this week's announcements was the release of Sybase IQ 15, an upgrade that brings several performance enhancements to what is undisputedly the leading column-oriented database with more than 1,500 active customers. Key upgrades include improved scalability in grid environments, streamlined query algorithms for faster query execution and multi-node loading that speeds time to query. Earlier this week I talked to Asif Rahman at Loan Performance. This Sybase IQ customer is a division of insurance giant FirstAmerican that tracks the performance of mortgage loans. (My first question was, "So you guys were in a position to prevent the big subprime mortgage mess we're in, eh?" Rahman responded that most of the loans the firm tracks are held by the originators rather than securitized and sold off to the likes of Fannie Mae and Freddie Mac, and it's the securitized loans that are said to account for the bulk of the troubled loans.)

Interestingly enough, Loan Performance initially launched the warehouse behind its True Standings analysis product on Microsoft SQL Server, but the data volumes and query complexity soon proved to be too much. "When we rolled out in 2004, end users were thrilled because they could build reports from scratch and drag and drop any field they wanted," says Rahman, director of application development. "Unfortunately, people soon started to complain about the query performance and we were also having a difficult time updating the database."

After considering Oracle, Netezza and a higher-horsepower deployment of SQL Server, Loan Performance switched to Sybase IQ in late 2005 because it concluded that "a general-purpose database would not work for us," Rahman explains. "With Sybase IQ, we can add fields to an analysis without worrying that it will slow down the queries."

It should be noted that SQL Server 2008 has since introduced a "resource governor" feature and improved compression capabilities aimed at enhanced scalability. Project Madison's MPP architecture will take scalability and performance to even further extremes, but even then I doubt it will match Sybase IQ, Vertica or any other column-oriented database in terms of analytic query performance. When the task is querying selected attributes stored in columns, row-oriented databases like Oracle, Microsoft SQL Server and IBM DB2 just can't keep up, even with the aid of parallel processing.
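
For readers who want to see why the storage layout matters, here is a minimal Python sketch of the row-versus-column difference. It is a toy illustration, not how Sybase IQ, Vertica or any row-oriented product is implemented; the loan records, field names and function names are all hypothetical, and the "fields touched" counter is only a rough stand-in for the I/O a real engine would perform.

```python
# Hypothetical loan records; field names and values are illustrative only.
ROWS = [
    {"loan_id": 1, "state": "CA", "balance": 250_000, "days_delinquent": 0},
    {"loan_id": 2, "state": "NV", "balance": 180_000, "days_delinquent": 90},
    {"loan_id": 3, "state": "CA", "balance": 320_000, "days_delinquent": 30},
]


def to_columns(rows):
    """Pivot row records into one list per attribute (a column layout)."""
    columns = {name: [] for name in rows[0]}
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return columns


def total_balance_row_store(rows):
    """Row layout: each full record is read even though one field is needed."""
    fields_touched = 0
    total = 0
    for row in rows:
        fields_touched += len(row)  # the whole record comes along for the ride
        total += row["balance"]
    return total, fields_touched


def total_balance_column_store(columns):
    """Column layout: only the 'balance' column is read."""
    balance = columns["balance"]
    return sum(balance), len(balance)


if __name__ == "__main__":
    print(total_balance_row_store(ROWS))
    print(total_balance_column_store(to_columns(ROWS)))
```

Both functions return the same total, but the row version touches every field of every record while the column version touches only the values it needs. In this toy model the gap widens as fields are added to each record, which is consistent with Rahman's remark about adding fields to an analysis without slowing queries.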

Rahman says Loan Performance is beta testing Sybase IQ 15, and he's particularly interested in the multi-node writing capability and extended support for parallel processing. "Right now we have only one writer, but we have two nodes in production and [the multi-node] feature would cut down our update times," he explains. "As for the parallelism, we see some support for that in older versions of IQ, but they've refined it in IQ 15, and without making changes in our hardware, we've seen 15% to 20% improvements in performance."

Loan Performance customers query the True Standings database directly, and response times currently range from sub-second to as long as five minutes, depending on how many millions or billions of records are being explored. A 15% to 20% improvement in query performance would mean that much higher customer satisfaction, says Rahman.

The third and final announcement this week was from Sybase IQ rival Vertica, which introduced the Vertica Virtualized Analytic Database, a version that runs in a VMware virtual machine. It gives data warehouse pros a way to quickly add processing horsepower when spiky applications, seasonal demand, one-time projects or pilot tests would swamp fixed deployments. Costs start at $100,000 for a 1-terabyte deployment.

My takeaway on this week's news is that the options for data warehousing just keep getting better and more numerous while competition and Moore's Law keep increasing the performance and capacity per dollar. Unlike some categories I cover, the market seems dynamic, fast-moving and anything but commoditized, despite the move to commodity hardware.
