Microsoft Azure Adds NoSQL, Search & HBase Services

Microsoft brings DocumentDB, Search, and HDInsight HBase services to Azure, boldly entering the NoSQL arena and rounding out its cloud-based data platform.

Doug Henschen, Executive Editor, Enterprise Apps

August 21, 2014

A management interface view of the Microsoft Azure DocumentDB NoSQL database service, now in public preview.

Microsoft extended its cloud-based Azure Data Platform on Thursday, announcing a NoSQL document database service, a search service, and the general release of an HBase service as part of its HDInsight Hadoop offering.

The biggest news is clearly Azure DocumentDB, which marks Microsoft's entry into the NoSQL market. Microsoft is effectively joining IBM and Oracle, which had previously endorsed NoSQL as a database type built for modern application requirements.

"The demands of mobile and cloud-backed applications are pretty hard on traditional infrastructure," acknowledged T.K. "Ranga" Rengarajan, a corporate VP who oversees Microsoft SQL Server, Azure HDInsight, and Microsoft's Analytics Platform System. "The NoSQL database community has emerged to take into account the different problems of scale, schema evolution, relaxed consistency, and the need to work on multiple devices."

Azure DocumentDB was developed from scratch, drawing on technologies developed by Microsoft Research for the in-memory capabilities introduced in SQL Server 2014, Rengarajan told InformationWeek in a phone interview. The database is designed to bridge the gap between NoSQL and relational databases, offering schema flexibility and scalability but also tunability (for transactional consistency versus performance) and SQL query capabilities, said Rengarajan, a Sybase and SAP veteran who joined Microsoft in 2013.

"We've taken a fresh start with DocumentDB; it's born in the cloud, and it's built with these tradeoffs in mind," he explained. "Sharding and multiple copies [of data] are inherent parts of the database, and we employ an internal component that can execute SQL queries at a per-shard level."

Tunable consistency is not unknown in the NoSQL market, but as the category name suggests, most products in the NoSQL camp offer, at best, a pale imitation of SQL query capabilities.
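
To make the idea of per-shard query execution concrete, here is a minimal sketch in Python. It is not DocumentDB's implementation or API; the `ShardedStore` class, its byte-sum routing scheme, and the sample documents are all hypothetical, chosen only to show how a query can be evaluated locally on each shard and the results merged.

```python
from typing import Callable, Dict, List

Document = Dict[str, object]


class ShardedStore:
    """Toy document store that partitions documents across shards.

    Hypothetical illustration only; not how DocumentDB is built.
    """

    def __init__(self, shard_count: int) -> None:
        self.shards: List[List[Document]] = [[] for _ in range(shard_count)]

    def insert(self, key: str, doc: Document) -> None:
        # Route by a deterministic function of the key (assumed scheme).
        index = sum(key.encode("utf-8")) % len(self.shards)
        self.shards[index].append({"id": key, **doc})

    def query(self, predicate: Callable[[Document], bool]) -> List[Document]:
        # Scatter: every shard evaluates the predicate on its own data.
        # Gather: partial results are merged and sorted by id.
        results: List[Document] = []
        for shard in self.shards:
            results.extend(doc for doc in shard if predicate(doc))
        return sorted(results, key=lambda d: str(d["id"]))


store = ShardedStore(shard_count=3)
store.insert("note-1", {"author": "ana", "pages": 4})
store.insert("note-2", {"author": "raj", "pages": 9})
store.insert("note-3", {"author": "ana", "pages": 12})

# Roughly: SELECT * FROM docs WHERE author = 'ana' AND pages > 3
matches = store.query(lambda d: d["author"] == "ana" and d["pages"] > 3)
print([d["id"] for d in matches])  # ['note-1', 'note-3']
```

A production system would push a parsed query to each shard in parallel and handle replicas and consistency levels; the sketch leaves all of that out.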

Azure DocumentDB is entering public preview with Thursday's announcement, but Rengarajan said it has been extensively beta tested. Most notably, for the last three months DocumentDB has managed the metadata behind the Microsoft OneNote applications.

"OneNote is a mobile application backed up by the cloud that lets millions of customers annotate their notes, documents, and pictures," he explained. "The global metadata index of who annotated what and what content is where is now stored in Azure DocumentDB." The sheer scale of this use case is a testament to DocumentDB's scalability.

Microsoft is supporting DocumentDB with libraries and SDKs for popular languages and platforms, including .NET, Node.js, JavaScript, and Python. The database itself is a commercial service, but Microsoft said it will be contributing many of these libraries to open source.

There are no plans, at present, to turn Azure DocumentDB into an on-premises product, according to Rengarajan, but he did not rule out that possibility. The new service would seem to pose the biggest threat to MongoDB, the popular open source NoSQL database, although Microsoft only recently added a MongoDB service on Azure. More than half of MongoDB's customers use the database in the cloud, on Amazon Web Services or other third-party clouds, but they always have the choice of bringing the database into their own data centers with an on-premises deployment.

Oracle entered the modern NoSQL market in 2011 with the Oracle NoSQL database, a scalable key-value store database based in part on the open source BerkeleyDB database Oracle acquired in 2006. IBM has a significant partnership with MongoDB, and early this year it acquired Cloudant, a database-as-a-service (DBaaS) provider that delivers the open source Apache CouchDB as a service (not to be confused with the Couchbase NoSQL database).

Azure Search and HBase services

Azure Search is another service entering public preview on Thursday. As the name suggests, the service is designed to bring search to applications by calling on a cloud-based API. The primary target is mobile applications backed by Microsoft's cloud services.

With Azure Search, Microsoft handles the text-indexing and search-server infrastructure. The service is said to incorporate expected functionality including hit highlighting, faceted search, and common Boolean search expressions. Developers upload content to be indexed and can use a portal or management API to adjust performance and document counts or to tune indexes to influence the relevance of selected terms or content themes.
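
For readers unfamiliar with the features named above, the following Python sketch shows what Boolean AND matching, faceted counts, and hit highlighting mean in practice. It uses a tiny in-memory inverted index and does not call or imitate the Azure Search API; the sample documents and the function names `search_all`, `facet_counts`, and `highlight` are invented for illustration.

```python
import re
from collections import Counter, defaultdict
from typing import Dict, List, Set

# Hypothetical sample content a developer might upload for indexing.
docs: Dict[int, Dict[str, str]] = {
    1: {"text": "Red running shoes for trail use", "category": "shoes"},
    2: {"text": "Blue running jacket", "category": "apparel"},
    3: {"text": "Trail running shoes, waterproof", "category": "shoes"},
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


# Inverted index: term -> set of ids of documents containing it.
index: Dict[str, Set[int]] = defaultdict(set)
for doc_id, doc in docs.items():
    for term in tokenize(doc["text"]):
        index[term].add(doc_id)


def search_all(terms: List[str]) -> List[int]:
    """Boolean AND: return ids of documents containing every term."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return sorted(set.intersection(*postings)) if postings else []


def facet_counts(doc_ids: List[int], field: str) -> Counter:
    """Faceting: count matching documents per value of a field."""
    return Counter(docs[i][field] for i in doc_ids)


def highlight(text: str, terms: List[str]) -> str:
    """Hit highlighting: wrap each matched term in <em> tags."""
    alternatives = "|".join(re.escape(t) for t in terms)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(r"<em>\1</em>", text)


hits = search_all(["running", "shoes"])
print(hits)                            # [1, 3]
print(facet_counts(hits, "category"))  # Counter({'shoes': 2})
print(highlight(docs[1]["text"], ["running", "shoes"]))
```

A hosted search service adds relevance scoring, stemming, and index tuning on top of these basics, none of which the sketch attempts.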

Microsoft introduced its HDInsight Hadoop service on Azure last fall, but Thursday's announcement marks the general availability of HBase NoSQL database services. HBase stands apart from other NoSQL databases in that it's specific to the Hadoop platform. The biggest demand for the new service comes from content-heavy applications that combine high scale with a need for frequent updates, such as those capturing user-generated content from social networks or high volumes of meter or sensor data in Internet-of-Things-style deployments.

Putting Microsoft's new services into context, Rengarajan said the bigger picture is building a cloud-based data platform with a comprehensive set of capabilities that will enable every person and organization to be informed by data.

"Azure Data Platform customers can combine data from on-premises SQL Server with SQL DB in the cloud, they can capture document data with the NoSQL service, they can run machine learning algorithms over this data with AzureML, and they can visualize data through PowerBI," he said.

About the Author(s)

Doug Henschen is Executive Editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data and analytics. He previously served as editor in chief of Intelligent Enterprise, editor in chief of Transform Magazine, and Executive Editor at DM News. He has covered IT and data-driven marketing for more than 15 years.
