The No-Sacrifice, Affordable Data Warehouse App

Smart planning and open source technologies can cut away huge costs normally associated with enterprise data warehouse applications. Budget-constrained large enterprises have put these tips to the test, and smaller enterprises can benefit from them, too.

Lintel Works

Microsoft, among other vendors, often uses fear, uncertainty, and doubt (FUD) to raise concerns about Linux. Just stop listening to this market hype! Linux works, period. And it's an excellent option for small, midsized, and even large companies. It works on single-CPU systems as well as clusters. But don't take my word for it; look at the growing body of evidence. For example, Linux clusters are used by some of the largest firms in the world, such as ETrade, OfficeMax, AT&T, J.C. Penney, Google, Yahoo, and American Express, just to name a few.

With a Linux/Intel architecture, you get not only an operating system that avoids the sluggishness of the Windows file system, but also an environment that's considerably less expensive than competing technology like pSeries/AIX or Solaris/Sun. Essentially, you have the best of both worlds: a solid Unix-style operating system that's relatively inexpensive.

Private industry and governments alike are taking advantage of Linux. In the private sector, the total cost of ownership is much lower with Linux implementations than with typical technology offerings. Moreover, Linux is more efficient. It allows companies to customize the operating system to their particular needs, eliminating resource overhead for functions that are irrelevant to the task at hand. Products like Windows can't be customized this way. Apart from Linux, most operating systems on the market today simply don't offer that flexibility.

Governments are embracing Linux as well, for many of the same reasons the private sector is. Linux represents better value and offers more flexibility. The Israeli government recently announced it's standardizing on Linux. China, Germany, and other countries have announced or are in the middle of migration plans, while Russia, the United Kingdom, and Brazil are exploring the technology.

Moreover, as Linux gains momentum, the number of applications that run on it grows. At this point, virtually every leading Unix application runs on Linux. The list includes all the big names, like IBM and Oracle product offerings, as well as a growing number of smaller players that offer important technology, like Viador's BI portal.

Of course, many will argue that much of the year-to-year savings found in Linux-based systems is more a result of relentless cost cutting by hardware and database vendors than of the operating system itself. While there's some truth to that argument, the simple fact remains that Linux provides unarguable savings. A SuSE Linux license for an 8-way, 2GHz, 64-bit AMD Opteron system costs less than $3,000. That's a huge savings in anyone's book!

If your budget is tight and your requirements are significant, ignore the FUD ranting and look closely at a Linux/Intel platform.

MySQL: Open Source, High Quality

Another open source offering with good features and solid performance is MySQL. Like other open source products, MySQL is considerably less expensive than its competitors and, for many applications, just as functional. Companies such as Google, Toyota, Intel, DaimlerChrysler, Bayer, Colgate, and Yamaha (just to name a few) have all effectively used MySQL in their applications.

There are several reasons why MySQL is so popular. For example:

Performance: It's fast, stable, and easy to use.

Proven: The product at last count had more than four million active installations.

Inexpensive: As an open source product, it's developed and marketed at a fraction of what vendors like Oracle and Microsoft spend on their databases. These savings are generally passed on to the customer.

Charles Garry, an analyst at Meta Group, described MySQL as "a disruptive technology," upsetting the entire database market. It's easy to see why, with millions of users, solid performance and stability, and a really low price. As Figure 1 shows, prices for standard and enterprise licenses from Oracle, IBM, and Microsoft for a single CPU range from about $5,000 to $40,000. Now consider that MySQL Pro costs $595 per server.
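To put that price gap in concrete terms, here is a minimal Python sketch of the arithmetic. It uses only the list prices quoted above; the function name, the constant names, and the four-CPU server in the second part are hypothetical and serve only as illustration. Actual vendor pricing, discounts, and support contracts will change the result.

```python
# License prices quoted in the article (US dollars).
COMMERCIAL_PER_CPU_LOW = 5_000    # low end of the Oracle/IBM/Microsoft range
COMMERCIAL_PER_CPU_HIGH = 40_000  # high end of the same range
MYSQL_PRO_PER_SERVER = 595        # MySQL Pro, priced per server


def price_multiple(commercial_price, open_source_price):
    """Return how many times more the commercial license costs."""
    return commercial_price / open_source_price


# Single-CPU comparison, straight from the quoted figures.
low_multiple = price_multiple(COMMERCIAL_PER_CPU_LOW, MYSQL_PRO_PER_SERVER)
high_multiple = price_multiple(COMMERCIAL_PER_CPU_HIGH, MYSQL_PRO_PER_SERVER)
print(f"Single CPU: {low_multiple:.1f}x to {high_multiple:.1f}x the MySQL Pro price")

# Hypothetical four-CPU server (an assumption, not a figure from the article).
# Per-CPU pricing scales with the CPU count; per-server pricing does not.
HYPOTHETICAL_CPUS = 4
commercial_low_total = COMMERCIAL_PER_CPU_LOW * HYPOTHETICAL_CPUS
commercial_high_total = COMMERCIAL_PER_CPU_HIGH * HYPOTHETICAL_CPUS
print(f"Four CPUs: ${commercial_low_total:,} to ${commercial_high_total:,} "
      f"versus ${MYSQL_PRO_PER_SERVER:,}")
```

The second comparison matters because per-CPU licenses grow with the size of the machine while a per-server license stays flat, so the gap widens on larger servers.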

Cox Communications, for instance, points to an application based on MySQL costing less than $90,000, for everything from hardware to annual licenses and support. An Oracle database license by itself was estimated at $300,000. Now that's savings! But Cox isn't the only story available. A NASA procurement office migrated from Oracle to MySQL because a license upgrade for Oracle was going to cost twice as much as its entire budget.

But customers aren't the only ones who see the value of MySQL. SAP, for instance, has a certified version of MySQL called MaxDB. With this version, you can dramatically reduce the cost of your SAP implementations, without sacrificing scalability (which is in the terabyte range), performance, or the administration features we've all come to expect from enterprise database technology.

Figure 1: Prices for standard and enterprise licenses from Oracle, IBM, and Microsoft for a single CPU range from about $5,000 to $40,000. MySQL Pro costs $595 per server.

Two Ways to Save

Whether you're trying to squeeze ambitious BI objectives into a tight budget or you simply want to maximize every dollar you spend on your BI effort, remember the following two things.

First, by knowing more about the data you source (with a data quality audit) and the application you're attempting to build (with a prototype), you reduce risk. And, by reducing risk, you save money. Period. Never, ever confuse risk reduction with more work or more expenses. That's simply wrong. Any risk mitigation should be performed early in your project effort, before you're fully committed. Only then can you modify your budget, requirements, or both. The wrong time to negotiate change is at the back end of design and development.

Second, open source products such as Linux and MySQL are solid and proven technology for BI efforts and are relatively inexpensive. Therefore, these products should be on your short list. Don't let vendors feed you FUD. As you know, vendors are often driven by personal gain and don't have your best interests at heart. Open source products may not ultimately fit your environment or culture, but that should be your decision.

Michael L. Gonzales is the president of The Focus Group Ltd., a consulting firm specializing in data warehousing. He has written several books, including IBM Data Warehousing (Wiley, 2003). He speaks frequently at industry user conferences and conducts data warehouse courses internationally.