5 Big Wishes For Big Data Deployments - InformationWeek


01:27 PM
Doug Henschen

5 Big Wishes For Big Data Deployments

Big data project leaders still hunger for some key technology ingredients. Starting with SQL analysis, we examine the top five wants and the people working to solve those problems.

5 Top Wishes For Big Data Deployments
If you've even experimented with building big-data applications or analyses, you're probably acutely aware that the domain has its share of missing ingredients. We've boiled it down to five top wants on the big-data wish list: SQL (or at least SQL-like) analysis options, shortcuts to deployment, easier advanced analytics, real-time analysis and network analysis.

The good news is that people and, in some cases, entire communities are working on these problems. There are armies of data-management and data-analysis professionals who are familiar with SQL, for example, so organizations naturally want to use that knowledge of the query language to make sense of data in Hadoop clusters and NoSQL databases -- the latter is no paradox, as "NoSQL" is commonly read as "not only SQL." It's no surprise that every distributor of Apache Hadoop software has proposed, is testing, and has released or will soon release an option for SQL or SQL-like analysis of data residing on Hadoop clusters. That group includes Cloudera, EMC, Hortonworks, IBM, MapR and Teradata, among others. In the NoSQL camp, 10gen has improved the analytics capabilities within MongoDB, and commercial vendor Acunu does the same for Cassandra.
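To make the appeal of SQL-style analysis concrete, here is a minimal sketch in Python that answers one question two ways: with a single SQL statement, and with hand-written map, shuffle and reduce steps. It is an illustration only. The sample records and the function names `sql_counts` and `mapreduce_counts` are hypothetical, SQLite stands in for a SQL engine, and plain Python lists stand in for a cluster; none of this is any vendor's actual product or API.

```python
import sqlite3
from collections import defaultdict

# Hypothetical click records: (user, page) pairs standing in for log data.
records = [("ann", "home"), ("bob", "home"), ("ann", "cart"), ("ann", "home")]


def sql_counts(rows):
    """Count page views per user with a single SQL statement."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clicks (user TEXT, page TEXT)")
    conn.executemany("INSERT INTO clicks VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT user, COUNT(*) FROM clicks GROUP BY user ORDER BY user"
    ).fetchall()
    conn.close()
    return result


def mapreduce_counts(rows):
    """The same count written as explicit map, shuffle and reduce steps."""
    # Map: emit a (key, 1) pair for every record.
    mapped = [(user, 1) for user, _page in rows]
    # Shuffle: group the emitted values by key.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    # Reduce: sum the values collected for each key.
    return sorted((key, sum(values)) for key, values in groups.items())
```

Both functions return the same per-user totals for the sample records. The SQL version is one declarative statement that any SQL-literate analyst can read, while the second version spells out the plumbing by hand, which is roughly the gap the SQL-on-Hadoop efforts aim to close.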

Deploying and managing Hadoop clusters and NoSQL databases is a new experience for most IT organizations, but it seems that each and every software update brings new deployment and management features expressly designed to make life easier. There are also a number of appliances -- available or planned by the likes of EMC, HP, IBM, Oracle and Teradata -- aimed at fast deployment of Hadoop. Other vendors are focusing on particularly tricky aspects of working with Hadoop framework components. WibiData, for example, provides open-source libraries, models and tools designed to make it easier to work with HBase, Hadoop's high-scale NoSQL database.

The whole point of gathering up and making use of big data is to come up with predictions and other advanced analytics that can trigger better-informed business decisions. But with the shortage of data-savvy talent in the world, companies are looking for an easier way to support sophisticated analyses. Machine learning is one technique that many vendors and companies are investigating because it relies on data and compute power, rather than human expertise, to spot customer behaviors and other patterns hidden in data.

One of the key "Vs" of big data (along with volume and variety) is velocity, but you'd be hard pressed to apply the phrase "real-time" to Hadoop, with its batch-oriented MapReduce analysis approach. Alternative software distributor MapR and analytics vendor HStreaming are among a small group of firms bringing real-time analysis to data in Hadoop. It's an essential step that other vendors -- particularly event-stream processing vendors -- are likely to follow.
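The difference between batch and real-time analysis can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a description of how MapR, HStreaming or any event-stream product works: `batch_counts`, `StreamingCounter` and the sample events are hypothetical names invented for the example.

```python
from collections import Counter


def batch_counts(events):
    """Batch style: scan the full stored data set, then report once."""
    return Counter(events)


class StreamingCounter:
    """Streaming style: update running totals as each event arrives."""

    def __init__(self):
        self.totals = Counter()

    def observe(self, event):
        # Fold the new event into the running total and return the
        # up-to-date count without rescanning earlier events.
        self.totals[event] += 1
        return self.totals[event]


# Hypothetical event feed.
events = ["login", "purchase", "login"]

stream = StreamingCounter()
running = [stream.observe(event) for event in events]
```

After the last event, the streaming totals match what the batch scan produces, but the streaming version had a current answer after every single event, whereas the batch version has nothing to report until the whole scan finishes.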

Last among the top five wishes for big data is easier network analysis. Here, corporate-friendly graph-analysis databases and tools are emerging that employ some of the same techniques Facebook uses at truly massive scale. Keep in mind that few of the tools and technologies described here have had 30 or more years to mature, as relational databases and SQL query tools have. But there are clear signs that the pain points of big-data management and big-data analysis are rapidly being addressed.
