Commentary
6/15/2011
09:23 AM
Doug Henschen

Ballmer To IBM, Oracle: You Don't Know Big Data

Microsoft's CEO says an inside-the-enterprise business intelligence focus misses the real opportunity for large-scale insight.

Microsoft is doubling down on "big data," one of the tech trends that's emerging as a top priority for its customers, says Steve Ballmer. But Microsoft's CEO isn't talking about the kind of internal business intelligence that he says is the focus for IBM, Oracle, and other rivals.

In a day of interviews with four InformationWeek editors at Microsoft's headquarters last week, Ballmer and several of his lieutenants provided an expansive vision for what big data and the related cloud computing movement will bring. It remains to be seen whether Microsoft can translate that vision into an advantage over rivals IBM and Oracle, or, more importantly, into real value for its customers. But we heard compelling arguments for blending on-premises data and computing capacity with new resources and capabilities in the cloud.

If you think about big data through the narrow lens of large-scale data warehousing, Microsoft is the greenhorn among the likes of EMC (by way of its Greenplum acquisition), Hewlett-Packard (through its Vertica acquisition), IBM, Oracle, and Teradata. Those vendors have fielded products for the top end of the market for years, while Microsoft didn't introduce its SQL Server 2008 R2 Parallel Data Warehouse (PDW) database until late last year. Hardware-complete PDW appliances from HP and other partners weren't available until early this year.

In fact, Microsoft didn't win its first PDW customer, the Direct Edge stock exchange, until last month, as I reported in this article. At 30-plus terabytes, the Direct Edge project is larger than any we've seen on Oracle's Exadata. (BNP Paribas's deployment, which started at 23 terabytes, is the largest Exadata reference customer we know of.)

The Direct Edge deployment won't be operational until later this year. And even at its zenith, this project won't hold a candle to the petabyte-scale deployments running on Greenplum, Netezza, and Teradata. Direct Edge says its deployment might scale up to about 200 terabytes.

So just where was Ballmer coming from when he said, "Nobody plays in big data, really, except Microsoft and Google"?

Search And Big Insight

Ballmer's perspective on big data is tied to the Bing Internet search engine, a business we heard much more about from Satya Nadella. Until a few months ago, Nadella was the senior VP in charge of engineering for Microsoft's online business, which includes search (Bing), the MSN portal, and Internet ad-serving. It says something that Nadella was Ballmer's hand-picked choice to take over as president of Microsoft's Server and Tools division in January, following the resignation of Bob Muglia.

Nadella has been at Microsoft since 1992, serving in Microsoft Business Solutions (responsible for Microsoft Dynamics applications) and on the server side of the business (working on Windows NT and other server products). During his four-plus years with Microsoft's online business, Nadella says he "relearned everything about infrastructure," something Microsoft's server business needs to do as it moves into cloud computing.

Microsoft's online operation puts big data into perspective. Bing's infrastructure comprises 250,000 Windows Server machines and manages some 150 petabytes of data. Microsoft processes two to three petabytes per day. "You really have to figure out how to process that kind of data to keep your index fresh," Nadella says.

Those interested in running apps in the cloud might dismiss Bing-related processing as being stateless -- not a continuously running component of a mission-critical app. As a counterexample, Nadella points to Microsoft's AdCenter, a complicated business application with a transactional data store. All Internet ad deliveries have to be tracked, and for every search, Microsoft runs some 30,000 auctions simultaneously to re-rank the ads. "That's as stateful an app as you can get," Nadella says.
