Microsoft And Oracle Are Scaling Out

New products involve running large data warehouses on clusters of small, low-cost servers.
Oracle and Microsoft, in a bid to land midsize data warehouse customers, are pitching new products aimed at a "scale-out" option--running large data warehouses on clusters of small, low-cost servers.

By bringing products for highly parallel architectures to midmarket users, two of the largest database vendors are acknowledging there are multiple dimensions to scalability. Companies can have as little as a terabyte of data but use complex queries or schemas, or have lots of people accessing the data. Such users often find they need a scale-out architecture.

Oracle last month rolled out the HP Oracle Exadata Storage Server and the HP Oracle Database Machine, both designed to raise performance for data warehouse queries. Oracle's products use the Exadata storage cell as a building block, relying on low-cost Hewlett-Packard hardware and intelligent Oracle software to off-load database processing to the storage tier and increase disk I/O bandwidth. The performance version of an Exadata storage cell will store 1 TB of user data and deliver 1 GBps of raw I/O bandwidth.

The effective bandwidth in processing a query can actually be much greater than 1 GBps per cell because of compression and because database operations such as filtering and projection are performed within the storage cell. This lets Oracle data warehouses offer significantly higher performance while requiring less space, power, and cooling, at a lower cost than conventional storage arrays.
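As a rough illustration of the arithmetic behind that claim, the sketch below models a full scan spread evenly across storage cells. The 1 GBps raw rate per cell comes from the figures above; the compression ratio, the row and column fractions, and the function names are illustrative assumptions, not Oracle specifications or measurements.

```python
def scan_seconds(logical_tb, cells, raw_gbps_per_cell=1.0, compression_ratio=1.0):
    """Seconds to scan `logical_tb` of uncompressed data spread evenly over `cells`.

    compression_ratio is an assumed value (2.0 means the data occupies half
    its uncompressed size on disk), so fewer physical bytes must be read.
    """
    logical_gb = logical_tb * 1000            # decimal units: 1 TB = 1000 GB
    physical_gb = logical_gb / compression_ratio
    return physical_gb / (cells * raw_gbps_per_cell)


def effective_gbps_per_cell(raw_gbps_per_cell=1.0, compression_ratio=1.0):
    """Rate at which one cell works through uncompressed (logical) data."""
    return raw_gbps_per_cell * compression_ratio


def shipped_gb(logical_tb, row_fraction, column_fraction):
    """Data leaving the storage tier after filtering rows and projecting columns.

    row_fraction and column_fraction are hypothetical query properties:
    the share of rows that pass the filter and the share of each row's
    bytes that the query actually needs.
    """
    return logical_tb * 1000 * row_fraction * column_fraction
```

Under these assumptions, 10 TB spread over 10 cells takes 1,000 seconds to scan uncompressed and 500 seconds at an assumed 2:1 compression ratio, which is the same as saying each cell works through logical data at 2 GBps. If a query keeps a tenth of the rows and half of each row, only about 500 GB of the 10 TB has to travel to the database servers.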

At its Business Intelligence Conference last week, Microsoft said it will integrate technology from DATAllegro, the data warehouse appliance vendor it acquired, with Microsoft SQL Server. The first products are expected in 2010.

Before this move, Microsoft focused on growing data warehouses via scaling up; customers would buy larger SMP servers when they needed a bigger warehouse. This approach has advantages in operational simplicity, but it imposes a ceiling on capacity. Microsoft still touts the scale-up option, but the DATAllegro technology adds a scale-out option.
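The contrast between the two growth paths can be put in rough capacity terms. The sketch below is a simplified model, not a sizing tool: the per-node and largest-server capacities are hypothetical inputs, and it ignores redundancy, indexes, and working space.

```python
import math


def scale_up_fits(data_tb, largest_server_tb):
    """Scale-up: the warehouse must fit on the biggest single server available.

    largest_server_tb is a hypothetical ceiling set by the hardware on offer.
    """
    return data_tb <= largest_server_tb


def nodes_needed(data_tb, tb_per_node):
    """Scale-out: capacity grows in node-sized steps, so round up to whole nodes."""
    return math.ceil(data_tb / tb_per_node)
```

With an assumed 8 TB ceiling on the largest server, a 12 TB warehouse does not fit the scale-up path at all, while a cluster of assumed 1 TB nodes would need 12 of them and could add a thirteenth when the data grows.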

The scale-out approach isn't new to large-scale data warehousing. Teradata has used it since 1984, IBM since the mid-'90s, and Oracle for nearly 10 years with RAC and now grid computing. HP Neoview and many data warehouse appliance startups emerging this decade are using it, too. In addition to reducing hardware costs, a good scale-out architecture promises modular capacity and potentially little or no disruption for upgrades.

With these latest announcements, Oracle and Microsoft are aiming to capture larger data warehouse deployments. Of course, like all highly parallel architectures, Oracle's and Microsoft's will have their limitations and bottlenecks. And they'll have to prove they're as good as those from vendors that started out with a highly parallel approach.

Illustration by Sek Leung
