Microsoft R Updates, IBM's Fraud-Detection Analytics Buy: Big Data Roundup - InformationWeek


1/17/2016
10:06 AM


In our big data roundup for the past week, we've got news from Microsoft about what it's been doing with statistical modeling language R, IBM's acquisition of a real-time fraud analytics company, Baidu's donation of some machine learning efforts to open source, and a management shakeup at Apache Spark company Databricks.


Microsoft updated its R statistical modeling language product lineup, Yahoo released a massive machine learning data set to the academic community, Baidu released some of its machine learning developments around speech recognition to open source, and IBM acquired a real-time fraud detection and analytics company. We've got those stories and more in this week's big data roundup.

Let's start with Microsoft. It's been a year since the company put a big stake in the ground by acquiring Revolution Analytics, a distributor of the open source R statistical modeling language. Back then, the move was viewed as a way for Microsoft to supplement its growing big data and analytics toolbox as well as to show that it understands the importance of open source. This week, the company announced that it is rebranding its R servers and development tools under the Microsoft name while continuing its commitment to offering many of those tools for free to the development community.

(Image: PonyWang/iStockphoto)


Meanwhile, another tech company showed that it cares about the development community, too. Yahoo released a massive machine learning data set to the academic research community. This data set includes the surfing and search habits of 20 million anonymous users.

Yahoo intends the data set to be used by researchers working on context-aware learning, large-scale learning algorithms, user behavior modeling, and content enrichment. Yahoo said the information includes data about how users interacted with the Yahoo home page, Yahoo News, Yahoo Sports, Yahoo Finance, Yahoo Movies, and Yahoo Real Estate. The data set is available as part of the Yahoo Labs Webscope data-sharing program, a reference library of data sets composed of anonymous user data for non-commercial use.


The research arm of Baidu, which has sometimes been described as the Google of China, has released Warp-CTC, a piece of its machine learning software, under an open source Apache license and posted it on GitHub. Warp-CTC builds on previous algorithms and was developed as Baidu worked on its Deep Speech recognition system, which has been shown to work for English and Mandarin. The company said in an FAQ that it is releasing the development to open source because "we want to make end-to-end deep learning easier and faster so researchers can make more rapid progress. … We want to start contributing to the machine learning community by sharing an important piece of code that we created." Baidu said that it expects to release additional open source AI tools in the future.

IBM announced Jan. 15 that it has acquired IRIS Analytics, a privately held company specializing in real-time analytics for combating payment fraud. IRIS Analytics focuses on detecting fraud as it is attempted rather than after it has happened. IRIS provides a real-time fraud analytics engine that leverages machine learning to generate rapid anti-fraud models while supporting the creation and modification of ad-hoc models, IBM said. Financial terms of the deal were not disclosed.

Databricks, the company whose founders developed the widely popular big data platform Apache Spark, has announced a series of top management changes. Ion Stoica is leaving his job as CEO and will assume the role of executive chairman. Current VP of engineering and product Ali Ghodsi has been named as CEO. Patrick Wendell will move into the role of VP of engineering, and Ron Gabrisko has joined the company as SVP of worldwide sales.

Databricks sells and services an implementation of Apache Spark, and these executive moves reflect the 2-year-old company's efforts to get serious about the commercial market and enterprise customers. "As the creators and drivers of the Spark engine, Databricks is at an inflection point where the pace of innovation coming from the community positions us for tremendous growth and opportunity in 2016," Stoica said in a prepared statement. "Ali [Ghodsi] is positioned to enable both Databricks and Spark to seek widespread enterprise adoption, momentum, and customer acquisition."

Data platform analytics company Looker this week announced it has closed a $48 million Series C funding round led by Kleiner Perkins Caufield & Byers, with participation from previous investors. The company said it will use the new capital to accelerate its growth through investments in sales, marketing, engineering, and international expansion.

Lastly, digital crowd-sourced encyclopedia Wikipedia is marking its 15th anniversary. To help celebrate the occasion, the folks over at FiveThirtyEight.com have collected the three most edited entries for each year since Wikipedia launched in 2001. Spoiler: Many of the highly edited entries are related to big news events for each year, particularly if those events were in any way controversial. For instance, in 2008 the most edited entry was for then-US vice presidential candidate Sarah Palin. Wikipedians are also obsessed with tracking deaths, major weather events and systems, popular culture, politics, and "the esoteric and arcane."

Jessica Davis has spent a career covering the intersection of business and technology at titles including IDG's Infoworld, Ziff Davis Enterprise's eWeek and Channel Insider, and Penton Technology's MSPmentor. She's passionate about the practical use of business intelligence, ...
