
Why Not Data Warehouse Appliances?

In my book, it's time to stop thinking of data warehouse appliances (including those powered by column-store databases) as experimental devices for pioneers and performance nuts. Having personally interviewed more than a handful of appliance customers, my sense is that we're on the cusp of a broad adoption phase. Will these devices simply complement conventional data warehouses as the foundation for data marts and non-mission-critical apps? Or will they also start replacing conventional enterprise data warehouses (EDWs)? I haven't heard many solid arguments against the appliance approach.

Just last week I scored an interview with appliance customer Reliance Communications, which is the Verizon or AT&T of India. The company has some 40 million customers and a growth rate that would be the envy of any executive; Reliance is adding some 1.5 million customers per month, thanks in large measure to India's growing economic strength and emerging middle class.

So how is Reliance coping with 1 billion new call data records each day swelling the company's 40-terabyte data warehouse? After exploring the field of data warehouse appliances in early 2007, Reliance implemented a 60-terabyte Greenplum appliance last summer, and it now has another 120-terabyte Greenplum implementation in the works. All 180 terabytes of capacity will be dedicated to call data records, which have to be kept around for 13 months for compliance reasons. Queries typically involve vast quantities of data.

"Greenplum was really new technology for us, so we wanted to start with the CDRs," says Raj Joshi, Vice President of Decision Support Systems. "Access to CDRs is not very frequent, but they need to go in a big database, and we wanted to address our biggest problem first."

The advantages of the appliance route? "I can't comment on our final costs, but the savings were substantial," says Joshi. "As far as performance goes, it's about three to five times faster [than our old warehouse], so the queries that were taking a couple of hours now take 30 minutes."

I've talked to a number of other companies with DW appliance deployments:

The New York Stock Exchange has multiple EDWs on Netezza appliances. Private equity firm Arsenal Capital Partners and its Sermatech business unit have an EDW on HP's Neoview, and HP points to about a dozen other customers that have gone public, including WalMart.

Trade Doubler, a European Web marketing firm, is using InfoBright's Brighthouse appliance to analyze Web clickstreams (a case study I have yet to write up). Corporate Express, the office supply giant, is running a data mart on Netezza. Executive Matt Schwartz described the deployment as a "go-fast sports car" as compared with a family sedan, suggesting that the maturity and versatility of conventional databases still appeal.

Yes, all of these customers proceeded with caution, knowing that DW appliances aren't yet the proven route, but best practices are emerging quickly. True, not all appliance vendors currently offer the depth and breadth of supporting data integration and data quality software that the database incumbents can offer, but third-party vendors are quickly stepping in and IBM, for one, has joined the appliance market.

Point taken, not all appliances can handle mixed query loads or vast numbers of users, but several can, and these ranks will surely grow with maturity. The bottom line is that these and many other appliance customers are getting great performance and they are spending less money. And then there's Teradata, which has been selling and succeeding with appliances for years, even if they weren't calling them that.

So my question is, what are the arguments against DW appliances? I'm sure there are other cases to be made, but I'm just not hearing them. Point me to a credible white paper!

In the absence of a strong case against appliances, I have to believe that only maturity and product diversity stand between the data warehouse market as we know it today and one dominated by appliances.