Microsoft Azure Does Big Data As A Service

Project Daytona tools let researchers use the Windows Azure cloud platform to crunch data, and some tech from archrival Google plays a part.

Paul McDougall, Editor At Large, InformationWeek

July 20, 2011

Microsoft unveiled new software tools and services designed to let researchers use its Azure cloud platform to analyze extremely large data sets from a diverse range of disciplines--and it's borrowing technology from archrival Google to help gird the platform.

The software maker unveiled the initiative, known as Project Daytona, at its Research Faculty Summit in Redmond, Wash., this week. The service was developed under the Cloud Research Engagement program of Microsoft's Extreme Computing Group, which aims to find new applications for cloud platforms.

"Daytona gives scientists more ways to use the cloud without being tied to one computer or needing detailed knowledge of cloud programming--ultimately letting scientists be scientists," said Dan Reed, corporate VP for Microsoft's technology policy group.

To sort and crunch data, Daytona uses a runtime version of Google's open-license MapReduce programming model for processing large data sets. The Daytona tools deploy MapReduce to Windows Azure virtual machines, which subdivide the information into even smaller chunks for processing. The data is then recombined for output.
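For readers unfamiliar with the model, the split, process, and recombine flow described above can be sketched in a few lines of Python. This is a minimal single-machine illustration of the general MapReduce pattern, not Daytona's code or API. The function names, the "station,temperature" input format, and the sample readings are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import islice

# Hypothetical sample input: one "station,temperature" reading per line.
READINGS = ["oslo,4.0", "cairo,30.0", "oslo,6.0", "cairo,34.0", "oslo,8.0"]


def split_into_chunks(records, chunk_size):
    """Subdivide the input into smaller chunks, one per (imagined) worker."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def map_chunk(chunk):
    """Map step: parse each line and emit a (station, temperature) pair."""
    pairs = []
    for line in chunk:
        station, temperature = line.split(",")
        pairs.append((station, float(temperature)))
    return pairs


def shuffle(mapped_chunks):
    """Group the values emitted by all chunks under their shared key."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_group(values):
    """Reduce step: collapse one key's values into a single mean."""
    return sum(values) / len(values)


def run_job(records, chunk_size=2):
    """Run map over every chunk, then recombine the results per key."""
    mapped = [map_chunk(chunk) for chunk in split_into_chunks(records, chunk_size)]
    return {key: reduce_group(values) for key, values in shuffle(mapped).items()}


if __name__ == "__main__":
    print(run_job(READINGS))
```

In a cloud deployment each chunk would go to a separate virtual machine rather than run in a loop on one computer, but the result does not depend on how the input is divided.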

Daytona is geared toward scientists in healthcare, education, environmental sciences, and other fields where researchers need--but don't always have access to--powerful computing resources to analyze and interpret large data sets, such as disease vectors or climate observations. Through the service, users can upload pre-built research algorithms to Microsoft's Azure cloud, which runs across a highly distributed network of powerful computers.

Daytona "will hopefully lead to greater scientific insights as a result of large-scale data analytics capabilities," said Reed.

Along with cloud computing, tools that make it easier for customers to store and analyze big data have become a priority of late for the major tech vendors. Demand for such products is being driven by the ever-increasing amounts of information that enterprises receive from sources as diverse as mobile devices and smart sensors affixed to everything from refrigerators to nuclear power plants.

Oracle has come to the table with its Exadata system, which can run in private clouds, while IBM spent $1.8 billion last year to acquire Netezza and its technologies for handling big data.

Microsoft sees its Azure public cloud as the ideal host for extremely large data sets. The Project Daytona architecture breaks data into chunks for parallel processing, and the cloud environment allows users to quickly and easily turn on and off virtual machines, depending on the processing power required.
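The appeal of that arrangement is that the amount of computing power becomes a setting rather than a fixed cost. The Python sketch below illustrates the idea under loose assumptions: a local thread pool stands in for a set of virtual machines, and the function names and the `workers` parameter are hypothetical, not part of Daytona or Azure. Threads in standard Python will not speed up this particular arithmetic. The point is only that the same job gives the same answer whether one worker or many handle the chunks.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(values, chunk_size):
    """Break the data into fixed-size chunks for independent processing."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def chunk_total(chunk):
    """Work done on one chunk; here, simply the sum of its values."""
    return sum(chunk)


def parallel_total(values, workers, chunk_size=1000):
    """Process every chunk on a pool of `workers`, then combine the partial sums."""
    chunks = chunked(values, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(chunk_total, chunks))
    return sum(partials)


if __name__ == "__main__":
    data = list(range(10_000))
    # Same job, different amounts of (simulated) processing power.
    print(parallel_total(data, workers=1))
    print(parallel_total(data, workers=8))
```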

The idea, in effect, is to provide data analytics as a service. Microsoft noted that Project Daytona remains a research effort that "is far from complete," and said it plans to make ongoing enhancements to the service. Project Daytona tools, along with sample code and instructional materials, can be downloaded from Microsoft's research site.

About the Author(s)

Paul McDougall is a former editor for InformationWeek.
