Commentary
3/5/2009
00:09 AM
Seth Grimes

SAS Gets with the (Open Source) Program

A January New York Times article on the R open-source statistical programming environment catalysed a change in attitude at SAS. In just one month, SAS's position swerved from disdain to embrace with an admission that "both R and SAS are here to stay, and finding ways to make them work better with each other is in the best interests of our customers." And that's good news, for SAS and for R.

A January New York Times article on the R open-source statistical programming environment catalysed a change in attitude at SAS, the largest independent BI and analytics vendor. In just one month, SAS's position swerved from disdain to embrace. The Times had quoted Anne H. Milley, SAS director of technology product marketing, as opining, "We have customers who build engines for aircraft. I am happy they are not using freeware when I get on a jet." A February SAS press release instead admitted that "both R and SAS are here to stay, and finding ways to make them work better with each other is in the best interests of our customers." And that's good news, for SAS and for R.

The same press release says an R interface for SAS/IML Studio -- IML is the Interactive Matrix Language -- is scheduled for release this summer. "'This is just the first step,' said Radhika Kulkarni, vice president of advanced analytics. 'We are busy working on an R interface that can be surfaced in the SAS server or via other SAS clients.'"

Why R? The language is algebraic, relatively easy to learn, and powerful, with many analysis, visualization, and interface packages available for free download. It's great for prototyping algorithms and for exploratory analysis. R has its limitations, scalability for instance, but a lack of robustness or reliability is not among them.

There's a certain irony in SAS's running R code. The irony isn't that SAS, with its reputation as a closed, monolithic platform, has (so far as I can tell) resisted open-source applications (above the tool and server level) more stubbornly than any other large software vendor I can think of. It's that SAS has embarked on a strategic partnership with data-warehousing powerhouse Teradata to optimize selected SAS analytics code for execution within the Teradata engine, and now here we have SAS itself linking to a foreign analytical package.

With an interface to R, SAS will join the company of analytical database vendors such as Greenplum and Truviso, which both support in-DBMS execution of code written in R and a variety of other languages. How does that work? I believe that in both cases the support stems from the products' derivation from the open-source PostgreSQL DBMS, with its PL/xxx server-side programmability via languages that include C, Java, Python, Perl, Ruby, and R. Here's how Mike Franklin, Truviso co-founder and CTO, puts it, regarding R specifically: "One of the great benefits of our PostgreSQL roots is our ability to leverage the PostgreSQL ecosystem. As a result, we not only can run PL/R, but we can actually run PL/R over continuous streaming data. We do have customers who have done this, mostly in the financial trading and on-line advertising optimization areas."
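For readers who want a feel for what in-DBMS execution means, here is a minimal sketch by analogy. It does not use PostgreSQL, PL/R, Greenplum, or Truviso. It assumes Python's built-in sqlite3 module as a stand-in engine, and the table name `trades`, the function name `sample_stddev`, and the sample rows are all made up for illustration. The point it shows is the general pattern: the analytic routine is registered with the database and invoked by the engine from inside a SQL query, rather than the data being pulled out into a separate analysis tool.

```python
import sqlite3
import statistics


class SampleStdDev:
    """User-defined aggregate; the engine calls step() once per row."""

    def __init__(self):
        self.values = []

    def step(self, value):
        # Skip SQL NULLs, as built-in aggregates do.
        if value is not None:
            self.values.append(value)

    def finalize(self):
        # A sample standard deviation needs at least two observations.
        if len(self.values) < 2:
            return None
        return statistics.stdev(self.values)


def build_demo_connection():
    """Create an in-memory database with made-up rows and register the aggregate."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical function name; registering it makes it callable from SQL.
    conn.create_aggregate("sample_stddev", 1, SampleStdDev)
    conn.execute("CREATE TABLE trades (symbol TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO trades VALUES (?, ?)",
        [("AAA", 10.0), ("AAA", 12.0), ("AAA", 14.0), ("BBB", 5.0), ("BBB", 5.0)],
    )
    return conn


def stddev_by_symbol(conn):
    """Run the analytic inside the query; only the summary rows come back."""
    rows = conn.execute(
        "SELECT symbol, sample_stddev(price) FROM trades "
        "GROUP BY symbol ORDER BY symbol"
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(stddev_by_symbol(build_demo_connection()))
```

In a PostgreSQL-derived product the registered routine would be written in a PL/xxx language such as PL/R and would run in the server itself; the SQLite version above runs in-process and is only meant to convey the shape of the idea.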

But back to SAS... New York Times reporter and blogger Ashlee Vance's work appears to have been instrumental in the company's grudging acceptance of R. Follow-on blog articles include helpful background material and expand on the article that was published in the paper. A January blog article notes R's origin as an implementation of the AT&T-developed S programming language and that a private company, Revolution Computing, now offers commercial support and extensions for R. These extensions (not reported in Vance's blog) include parallelization and solutions for domains including life sciences and finance. It's commercial support such as this that helps community-developed open-source software scale to meet enterprise needs.

(Vance reported on SAS's R plans in his blog on February 16.)

SAS is a very deliberate software publisher, by which I mean that the company puts very significant resources into R&D and customer support and doesn't invest in fads. SAS would not have expressed a commitment to R if it hadn't been contemplating the move for a while and if it didn't mean to carry through. Knowing SAS, I suspect the company will start contributing to the R project, to the benefit of the wide community of R users. SAS has, after all, supported and encouraged the growth of a large and active user community of its own, just the type of folks who are into the collaboration and sharing that underpin the most successful open-source projects.

SAS community forays into open-source code sharing have arisen from time to time, for instance the OS3A Open Source SAS Software Applications project, which "develops and distributes SAS software applications and programs and systems programmed in other languages useful for SAS software users," although it seems the efforts have been slow to develop and have usually petered out. Perhaps SAS's embrace of R, even if limited for now, will have larger repercussions, encouraging a more extensive SAS embrace of open source with a boost to collaborative, cooperative culture at SAS, which would benefit the company, its customers and users, and the wider analytics and BI software markets alike.
