Expert Analysis: You Can Predict That R Will Succeed

Adoption by IBM SPSS, SAS and Information Builders can't hurt. But REvolution Computing may do for R what Red Hat did for Linux.

InformationWeek Staff, Contributor

March 3, 2010


By David Stodder

"To learn from experience, predictive analytics is it," proclaimed Eric Siegel at last month's Predictive Analytics World event. "By generating a model over your data, your organization is essentially learning from its aggregate, collective experience. And that," added Siegel, the president of Prediction Impact and chair of the PAW event, "makes prediction the flagship value proposition of all things BI."

As more organizations recognize the importance of using data as a strategic asset, the tantalizing power to predict customer behavior, uncover fraud, anticipate manufacturing hiccups and speed insight from big data will drive mainstream interest in going beyond standard business intelligence and data warehousing. Yet, the assumption has been that with only so many PhDs to go around and limited budgets to pay for expensive tools, organizations find it difficult to move forward.

However, as experience with predictive analytics grows, at least the data preparation and scoring tasks -- often the lion's share of the effort -- will become less specialized and better served by software. As independent consultant and analyst Neil Raden noted during his talk, "the day in the life of a quant is filled with mostly artisan work." As for all the steps they perform in the extraction and preparation of data for predictive analysis? "You can train people to do this work," he said. "They don't have to be PhDs."

Possibly the most important factor influencing the spread of predictive analytics is the growing popularity of R, the open-source language and development environment for statistical computing and graphics. Intelligent Enterprise recognized the R Project as a 2010 "One to Watch." Vendors including IBM SPSS, Information Builders and SAS are incorporating R. As a GNU project, R will make it easier for developers to cost-effectively incorporate predictive analytics tools and algorithms into a variety of applications, services and systems.

Predictive Analytics World hosted an R Project community meeting that featured a rare talk by John Chambers, who at Bell Laboratories (now Lucent Technologies) developed S, R's predecessor. He retired from Bell Labs in 2005 after 40 years and is now a consulting professor of Statistics at Stanford University. He remains a strong influence on the development of R. Chambers offered a fairly technical discussion of points of interest, but one general point that struck me was that R is designed to fit with other approaches.

"No one system or paradigm will do everything, and fortunately, there are a lot of good ways of computing and building interfaces to help us link paradigms together," Chambers said. "But linking paradigms together is not a new idea; it is one of the two or three core ideas of S, going back to the very beginning."

One vendor to watch in the R space is REvolution Computing. Providing software and support to statisticians, data analysts and others using R to develop models and analyze results, the company could become the "Red Hat" of the R community. As Red Hat did with Linux, REvolution Computing is quick to incorporate changes from the R project into its toolkit so that users do not have to monitor developments themselves. An important focus has been to enable developers and statisticians to use parallelism and take advantage of multiprocessor systems and networked computers.

REvolution Computing is also gathering industry strength and backing. At the conference, the company announced that C. Hadlai "Tex" Hull, one of the founders of SPSS, would be joining the company as a technical and business advisor. REvolution Computing's president and CEO is Norman Nie, who was also a co-founder of SPSS and thus is now reunited with his former colleague. The company's CTO is David Champagne, who served as principal architect and engineer for SPSS. And for R credentials, one of the foundational developers of R, Robert Gentleman, serves on the company's board. In addition to technical advice, Gentleman will help the company grow through connections in the pharmaceutical and bioengineering industries; he is a senior director of Bioinformatics at Genentech (now part of the Roche Group). Finally, REvolution Computing is bringing on board experienced software industry hands; the company announced on March 2 that Zack Urlocker, most recently executive VP of MySQL, has joined the board of directors.

So, it's a safe bet that R will grow in use and influence in predictive analytics. What remains to be seen is how soon predictive analytics tools will become pervasive outside the confines of the statistics and data mining community.

David Stodder is an independent analyst, writer and researcher focused on innovative uses of information to achieve business objectives. Along with heading up his own firm, Perceptive Information Strategies, he is a Research Fellow with Ventana Research.
