Big Data. Big Decisions
InformationWeek
Special Coverage Series


Microsoft Adds Big Data To SQL Server 2012

Apache Hadoop will power Azure cloud services and Windows deployments. New BI tools support touch-based data exploration--including on Apple devices.

Microsoft announced Wednesday that the next major release of its database platform will be called SQL Server 2012. The release was expected and the name is no shocker, but Microsoft added a couple of surprises. First, the platform will include big data processing capabilities based on Apache Hadoop. And second, new touch-based data exploration capabilities will be extended to Apple iOS devices.

Formerly code named Denali, Microsoft's next SQL Server release has been discussed and available as a community technology preview for several months. The company announced at the PASS Summit 2011 event that the database platform will become generally available in the first half of next year. Again, that could have been predicted.

Microsoft's public embrace of Hadoop comes just one week after rival Oracle announced that it, too, will release a distribution of Hadoop and will put the software on a Big Data Appliance built on Oracle Sun hardware. EMC announced a distribution of Hadoop in May and followed up last month by announcing a modular appliance that can run the Greenplum database and Hadoop on the same platform.

Interest in Hadoop is driven primarily by the need to handle large volumes of loosely or inconsistently structured data such as social network feeds, Web logs, email, documents, and other text-centric information. These data types can be used for applications such as customer sentiment analysis, but they cannot be effectively managed in a relational database such as SQL Server, Oracle Database, or IBM's DB2.

"We're seeing significant changes in the data landscape, with businesses encountering more types of data--more shapes, more sizes--than ever before; to address those changes we need a new data platform," Doug Leland, general manager of product management for SQL Server, said in an interview with InformationWeek.

Microsoft will support Hadoop with an Apache-derivative distribution that will run as a service on the company's Azure cloud platform and an on-premises release that will run on Windows Server. The Azure service will debut in beta by the end of this year while the software release will follow in 2012, though the company didn't specify which quarter or even which half of next year.

Running on Windows will be a new trick for an open source platform that has heretofore run on Linux. Will Microsoft's release be free and open source? That has yet to be announced, Leland said, and there was no word on whether there would be supporting appliances on third-party hardware, as there are for the SQL Server Parallel Data Warehouse.

Leland did note that the software will be "consistent and compatible with the Apache Hadoop core." He also noted that Microsoft has partnered with Hortonworks, a Yahoo spinoff that specializes in Hadoop, to help develop the software distributions and propose contributions back to the Hadoop community.

Microsoft will give customers several ways to exploit data from Hadoop. Available immediately will be final versions of the previously announced Hadoop Connectors for SQL Server and the SQL Server Parallel Data Warehouse. These connectors will enable data to be passed in either direction between SQL Server and Hadoop, but data is more likely to flow from Hadoop into SQL Server, so that the results of big-data processing jobs on Hadoop can be analyzed with familiar SQL analysis tools.
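
To make that Hadoop-to-SQL flow concrete, here is a minimal sketch. It is not Microsoft's connector. It assumes a Hadoop job has written tab-separated "term, count" lines, and it uses Python's built-in sqlite3 module as a stand-in for SQL Server. The sample rows, the table name term_counts, and the function names are all hypothetical, chosen only for illustration.

```python
import sqlite3

# Hypothetical job output: one "term<TAB>count" pair per line.
# The same term can appear more than once (e.g. from separate output files).
SAMPLE_JOB_OUTPUT = "refund\t120\nlove\t310\nbroken\t95\nlove\t40\n"


def load_job_output(conn, text):
    """Parse tab-separated job output and insert it into a SQL table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS term_counts (term TEXT, hits INTEGER)"
    )
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        term, hits = line.split("\t")
        rows.append((term, int(hits)))
    conn.executemany(
        "INSERT INTO term_counts (term, hits) VALUES (?, ?)", rows
    )
    conn.commit()
    return len(rows)


def top_terms(conn, limit):
    """Aggregate with plain SQL, as an analyst might after the import."""
    cursor = conn.execute(
        "SELECT term, SUM(hits) AS total FROM term_counts "
        "GROUP BY term ORDER BY total DESC, term ASC LIMIT ?",
        (limit,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # sqlite3 stands in for SQL Server
    load_job_output(conn, SAMPLE_JOB_OUTPUT)
    for term, total in top_terms(conn, 2):
        print(term, total)
```

The point of the sketch is the division of labor: the heavy processing happens upstream in Hadoop, and only the compact result set lands in the relational database, where ordinary SQL takes over.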

The coming software distributions (including the beta Azure service due out by year end) will add a Hive ODBC driver that will enable customers to use Microsoft's familiar business intelligence (BI) tools to analyze data directly within Hadoop. Hive is the Apache query and analysis tool for Hadoop.

SQL Server's BI capabilities will be enhanced significantly in the 2012 release by the addition of Power View, formerly code named "Crescent." Microsoft Senior Vice President Ted Kummert was expected to demonstrate the Power View data exploration and visualization capabilities on Wednesday on Apple iOS devices, including the iPad. The Power View touch capabilities won't be available until the end of 2012, by which time there just might be Windows 8-based tablet competitors. But given the iPad's dominance among tablets and the current lack of credible competition running on Windows, supporting iOS was a long-overdue choice Microsoft had to make.

To complement the Windows Azure Data MarketPlace, Microsoft also demonstrated a new Data Explorer tool that will make it easier to browse and use data from public-cloud data sources. The MarketPlace is accessible in 26 countries, and it offers hundreds of data sets including financial data, demographic data, and geospatial data. Data Explorer includes data-visualization components for browsing data and extract-transform-load capabilities for enhancing your own data with purchased data. Resulting new data sets can also be uploaded back to the MarketPlace for sale or free distribution.

Microsoft, like Oracle, has given itself lots of time to deliver software and services for on-premises Hadoop deployments. Microsoft's commitment to have a beta Hadoop service up and running on Azure by year end is a bit more exciting (although such services are already available on Amazon's cloud). The Hive ODBC driver and Hadoop connectors for SQL Server promise to make Hadoop accessible. It's likely that 99% of SQL Server customers will be more interested in Power View and conventional database capabilities in the near term. But with two database giants now embracing Hadoop, it's clear that unstructured data processing and analysis will eventually go mainstream.






