Low-Cost Options For Predictive Analytics Challenge SAS, IBM
Upstarts Revolution Analytics and Alpine Data Labs step up the competition against market leaders.
Four-year-old Revolution Analytics provides commercial support for the R programming language as well as tools, integration, consulting and training aimed at making an open-source product enterprise ready.
Riding R's coattails is a good idea. The language is used by 43% of data miners, according to Rexer's Annual Data Mining Survey, and it has been embraced and supported by commercial software vendors including SAS, SPSS, Information Builders, and Tibco.
Revolution's core Revolution R Enterprise deployment is said to improve performance over standard R by adding support for multithreading on multi-processor and multi-core hardware. A RevoScaleR package offers widely used statistical algorithms optimized for big-data analysis (meaning tens of terabytes or more) in clustered environments such as Microsoft Windows HPC Server. Revolution says this high-performance computing approach on commodity hardware far surpasses the speed and scalability of conventional analytic servers at a fraction of the cost.
The R community offers plenty of ready-to-run statistical and data-analysis techniques and analytic applications incorporating those methods. Revolution Analytics, which is run by a bunch of former SPSS and SAS executives, provides development, debugging, and deployment tools as well as the aforementioned support and consulting to keep your people productive.
As far as MPP environments are concerned, Revolution runs in-database on IBM Netezza. That's a very short list, but the company says it's working on similar partnerships with other leading MPP vendors.
A recent distinction for Revolution is the release of R extension packages to work with Hadoop, the open-source storage and data-processing environment. The packages provide connectivity to the HDFS file system and HBase as well as Hadoop streaming so you can create MapReduce jobs in R for iterative, super-high-scale data processing on Hadoop. MapReduce is well suited for processing large-scale unstructured information such as all the comments associated with your brand in Facebook, Twitter or other social networks.
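To make the MapReduce pattern concrete, here is a minimal sketch of its map, shuffle, and reduce steps applied to a handful of social-media comments. It is written in plain Python and runs in a single process, so it illustrates the programming model only; it is not Revolution's R packages or the Hadoop streaming interface, and the function names and sample comments are made up for illustration.

```python
from collections import defaultdict


def map_comment(comment):
    """Map step: emit a (word, 1) pair for every word in one comment."""
    for word in comment.lower().split():
        yield word.strip(".,!?\"'"), 1


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        if key:  # skip tokens that were punctuation only
            grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(comments):
    """Run map, shuffle, and reduce over an iterable of comment strings."""
    pairs = (pair for comment in comments for pair in map_comment(comment))
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample comments, not real data.
    sample = ["Love the new phone!", "The new phone is slow."]
    print(word_count(sample))
```

On a real cluster the map and reduce functions would run in parallel across many nodes, with the framework handling the shuffle between them; that division of labor is what lets the same logic scale to very large comment volumes.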
Revolution says the cost of a deployment depends on the power and capacity of the server, with deployment on a small, eight-core server costing $25,000 per year, including maintenance support. There's no limit on the number of users, but the company says a conservative approach would reserve one core per user, for a total of eight to ten users.
A SAS Comparison
Alpine and Revolution both pick on SAS when the topic turns to pricing. I've heard a few claims about comparable deployments being a fraction of the cost, so I thought I'd go straight to the source. SAS says SAS Analytics Pro, which includes the Base SAS server, SAS/STAT for statistical analysis, and SAS/GRAPH for data visualization, costs $8,000 per year for the first user and $1,710 for each additional user per year. This includes full support. So that's $19,970 for eight users or $23,390 for ten users.
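The arithmetic behind those totals can be checked with a short sketch. It uses only the list prices quoted in this article ($8,000 for the first SAS Analytics Pro user, $1,710 for each additional user, and $25,000 per year for Revolution's eight-core server); the function and constant names are illustrative, and the side-by-side figure is a rough per-seat view rather than a like-for-like comparison, since the two prices cover different software and hardware.

```python
def sas_analytics_pro_cost(users, first_user=8000, additional_user=1710):
    """Annual SAS Analytics Pro cost for a given number of users,
    using the per-user prices quoted in the article."""
    if users < 1:
        raise ValueError("at least one user is required")
    return first_user + (users - 1) * additional_user


# Revolution's quoted annual price for a small eight-core server,
# which the company suggests suits roughly eight to ten users.
REVOLUTION_EIGHT_CORE_SERVER = 25000

if __name__ == "__main__":
    for users in (8, 10):
        sas_total = sas_analytics_pro_cost(users)
        print(
            f"{users} users: SAS ${sas_total:,} "
            f"(${sas_total / users:,.2f} per user), "
            f"Revolution ${REVOLUTION_EIGHT_CORE_SERVER:,} "
            f"(${REVOLUTION_EIGHT_CORE_SERVER / users:,.2f} per user)"
        )
```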
That doesn't sound nearly as high as I would have anticipated, given the competitive claims. It also doesn't jibe with the aforementioned Wikipedia-published comparison.
I'm guessing the comment field below will see a few war stories and "yeah, but" analyses, particularly as the size of the deployments scales into the hundreds or thousands of users.
The choice of software and analysis of cost should be driven largely by the professionals who will be asked to use the products. Let the results, not just initial software cost, justify the selection. Familiarity can be a big productivity advantage. The popularity of R speaks volumes, and so, too, does SAS's market share among commercial software providers.
Alpine Miner is a bit of a different animal; it's purpose-built for big-data deployments. Keep this product's scale of data analysis in mind when trying to develop comparable pricing analyses. The number of supported users is in the same league as the other two, yet costs are higher. The difference is that Alpine is running in a big-data MPP environment, not on a low-end server.
Having more options is a healthy sign for the analytics market. You'll have a lot to consider when weighing the cost of expertise and software. If you want to focus on what's next (the big financial risks ahead, the best customers likely to drive your bottom line, the customers likely to bolt, and the products most likely to sell), you can't afford not to get into advanced analytics.