The value and utilization of open-source software have grown immensely in recent years, but in the booming analytics market, open source is strangely not keeping pace.
A variety of projects are out there: search open-source nexus SourceForge.net for "OLAP" or "reporting," and you'll find dozens. Unfortunately, many of these attempts seem amateurish, more intentions than anything approaching commercial-grade products. The projects that do appear worth a trial run boast only niche rather than broad-market appeal. Nonetheless, they do suggest far greater future adoption of open source for decision systems.
Where's the Linux OLAP?
First and foremost is Mondrian, an open-source OLAP server that accesses relational databases and supports Microsoft's Multidimensional Expressions (MDX) language, as well as the Java OLAP (JOLAP) and XML for Analysis application programming interfaces (APIs). A companion project, JPivot, provides a JavaServer Pages (JSP) tag library for tables and charts and acts as a Mondrian client. "The feedback I get from users is that they love how simple JPivot is to configure: just write a simple XML file or use the Eclipse plug-in," says lead developer Julian Hyde. "They love the JPivot user interface, and they find it fits seamlessly into their Web applications."
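To make "an OLAP server that accesses relational databases" concrete, here is a minimal sketch of the general idea: a dimensional request is answered by generating a SQL aggregate over a fact table joined to a dimension table. This is not Mondrian's code or API. The `store` and `sales` tables, the sample rows and the `rollup` helper are all hypothetical, and Python's built-in SQLite stands in for whatever relational database a real deployment would use.

```python
import sqlite3

# Hypothetical star schema: one fact table (sales) and one dimension table (store).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store (store_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE sales (store_id INTEGER, year INTEGER, amount REAL);
    INSERT INTO store VALUES (1, 'East'), (2, 'West');
    INSERT INTO sales VALUES
        (1, 2003, 100.0), (1, 2004, 150.0),
        (2, 2003, 80.0),  (2, 2004, 120.0), (2, 2004, 30.0);
""")

def rollup(conn, dimensions):
    """Sum sales by the requested dimensions by generating a GROUP BY query."""
    # Whitelist of dimension names mapped to columns; unknown names raise KeyError.
    allowed = {"region": "store.region", "year": "sales.year"}
    col_list = ", ".join(allowed[d] for d in dimensions)
    sql = (
        f"SELECT {col_list}, SUM(sales.amount) "
        "FROM sales JOIN store ON sales.store_id = store.store_id "
        f"GROUP BY {col_list} ORDER BY {col_list}"
    )
    return conn.execute(sql).fetchall()

by_region_year = rollup(conn, ["region", "year"])  # finer-grained view
by_region = rollup(conn, ["region"])               # rolled up across years
```

Dropping a dimension from the request rolls the totals up a level. A production server layers a query language such as MDX, caching and metadata on top of this basic translation step.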
Nigel Pendse, lead author of the "OLAP Report," has found that despite OLAP-vendor Linux support dating back several years, only 2 to 3 percent of the overall OLAP market runs OLAP on Linux.
But let's shift focus to analytics' high-performance computing (HPC) cutting edge, where open source dominates and users program their own analytics rather than looking to buy off the shelf. For those users, performance, reliability and scalability count far more than commercial branding. To succeed, open-source business analytics applications must appeal to that same class of exacting user by providing a similar platform for serious innovation.
A Product and a Process
The JPivot plug-in for Eclipse, an open tool-integration platform, that Mondrian developer Hyde mentioned is the tip of the iceberg. Companies including IBM, Ilog, Oracle and SAS have joined the Eclipse Consortium, and reporting vendor Actuate will steward the new Eclipse Business Intelligence and Reporting Tools (BIRT) Project (eclipse.org/birt). But with far less visibility and little press coverage, the R computing system (r-project.org) and a number of related open-source data mining efforts are already winning users over.
R is an open-source implementation of the S statistical programming language. It provides an excellent environment for exploratory data analysis and visualization. Health-care consultant Marc Schwartz says that R has advanced "under the leadership of some of the best minds in the business." A biotechnology research statistician told me that R and its associated contributed packages "simply can't be beat in terms of data analytic and static data visualization capabilities." Others have told me how easy it is to build Windows and Web interfaces for R and also to integrate R and software developed with other languages and environments.
The folks behind the free Open Office productivity application suite say that via open source, they offer "not only a product, but a process." Their mission is to create an office suite that will run on all major platforms and provide access to all functionality and data through open-component-based APIs and an XML-based file format. (Open Office is a worthy product, too, one I use myself except when third-party spreadsheet functions I need are exclusive to Microsoft Excel.) Australian research psychologist and R user Jim Lemon is on board; he likes the idea of a "cooperative effort to produce a stats-oriented programming language."
I see the open-source process changing the way we perform decision support by providing a platform that encourages end-user participation and supports collaborative innovation. Mainstream analytics will evolve slowly and perhaps incompletely toward this model. Open-source platform tools, HPC, R and similar successes will show the way.
SETH GRIMES is a principal of Alta Plana Corp., a Washington, D.C.-based consultancy specializing in large-scale analytic computing systems.