SAS Gets with the (Open Source) Program

A January New York Times article on the R open-source statistical programming environment catalysed a change in attitude at SAS. In just one month, SAS's position swerved from disdain to embrace with an admission that "both R and SAS are here to stay, and finding ways to make them work better with each other is in the best interests of our customers." And that's good news, for SAS and for R.

Seth Grimes, Contributor

March 5, 2009


A January New York Times article on the R open-source statistical programming environment catalysed a change in attitude at SAS, the largest independent BI and analytics vendor. In just one month, SAS's position swerved from disdain -- the Times quoted Anne H. Milley, SAS director of technology product marketing, as opining, "We have customers who build engines for aircraft. I am happy they are not using freeware when I get on a jet." -- to embrace with an admission that "both R and SAS are here to stay, and finding ways to make them work better with each other is in the best interests of our customers." And that's good news, for SAS and for R.

The February SAS press release quoted above says an R interface for SAS/IML Studio -- IML is the Interactive Matrix Language -- is scheduled for release this summer. "'This is just the first step,' said Radhika Kulkarni, vice president of advanced analytics. 'We are busy working on an R interface that can be surfaced in the SAS server or via other SAS clients.'"

Why R? The language is algebraic, relatively easy to learn, and powerful, with many analysis, visualization, and interface packages available for free download. It's great for prototyping algorithms and for exploratory analysis. R has its limitations, for instance scalability, but robustness and reliability are not among them.

There's a certain irony in SAS's running R code. The irony isn't that SAS, with its reputation as a closed, monolithic platform, has (so far as I can tell) resisted open-source applications (above the tool and server level) more stubbornly than any other large software vendor that I can think of. It's that SAS has embarked on a strategic partnership with data-warehousing powerhouse Teradata to optimize selected SAS analytics code for execution within the Teradata engine, and now here we have SAS linking to a foreign analytical package.

With an interface to R, SAS will join the company of analytical database vendors such as Greenplum and Truviso, which both support in-DBMS execution of code written in R and a variety of other languages. How does that work? I believe that in both cases, the support stems from the products' derivation from the open-source PostgreSQL DBMS, with its PL/xxx server programmability via languages that include C, Java, Python, Perl, Ruby, and R. Here's how Mike Franklin, Truviso co-founder and CTO, puts it, regarding R specifically: "One of the great benefits of our PostgreSQL roots is our ability to leverage the PostgreSQL ecosystem. As a result, we not only can run PL/R, but we can actually run PL/R over continuous streaming data. We do have customers who have done this, mostly in the financial trading and on-line advertising optimization areas."

But back to SAS... New York Times reporter and blogger Ashlee Vance's work appears to have been instrumental in the company's grudging acceptance of R. Follow-on blog articles include helpful background material and expand on the article that was published in the paper. A January blog article notes R's origin as an implementation of the AT&T-developed S programming language and that a private company, Revolution Computing, now offers commercial support and extensions for R. These extensions (not reported in Vance's blog) include parallelization and solutions for domains including life sciences and finance. It's commercial support such as this that helps community-developed open-source software scale to meet enterprise needs.

(Vance reported on SAS's R plans in his blog on February 16.)

SAS is a very deliberate software publisher, by which I mean that the company puts very significant resources into R&D and customer support and doesn't invest in fads. SAS would not have expressed a commitment to R if it hadn't been contemplating the move for a while and if it didn't mean to carry through. Knowing SAS, I suspect the company will start contributing to the R project, to the benefit of the wide community of R users. SAS has, after all, supported and encouraged the growth of a large and active user community of its own, just the type of folks who are into the collaboration and sharing that underpin the most successful open-source projects.

SAS community forays into open-source code sharing have arisen from time to time, for instance the OS3A Open Source SAS Software Applications project, which "develops and distributes SAS software applications and programs and systems programmed in other languages useful for SAS software users," although it seems the efforts have been slow to develop and have usually petered out. Perhaps SAS's embrace of R, even if limited for now, will have larger repercussions, encouraging a more extensive SAS embrace of open source with a boost to collaborative, cooperative culture at SAS, which would benefit the company, its customers and users, and the wider analytics and BI software markets alike.


About the Author(s)

Seth Grimes

Contributor

Seth Grimes is an analytics strategy consultant with Alta Plana and organizes the Sentiment Analysis Symposium. Follow him on Twitter at @sethgrimes
