The Hadoop-ification of the data world bled into this week when Microsoft, quiet save for its SQL Server Parallel Data Warehouse, announced that it, too, would be making big noise about big data. Microsoft added a couple of surprises to its plans for SQL Server 2012, wrote InformationWeek's Doug Henschen on Wednesday: "First, the platform will include big data processing capabilities based on Apache Hadoop. And second, new touch-based data exploration capabilities will be extended to Apple iOS devices."
Microsoft will offer Hadoop on Azure and on Windows Server, but details, including when, are nebulous--known only to some predictive analytics agent that has yet to be invented. What is known: There will be Hadoop connectors for SQL Server and SQL Parallel Data Warehouse, and there will be Hive ODBC drivers for customers using Microsoft's BI tools.
This Big Data is a Big Deal, and while it is still more future than present for most companies, the big suppliers (Oracle, IBM, EMC, HP, Teradata, SAP, Microsoft) are already jockeying for supremacy. Many of them are well known for their prowess in the world of structured data, but unstructured data is much more prevalent and much more difficult to manage, let alone to extract meaning from.
"Interest in Hadoop is driven primarily by the need to handle large volumes of loosely or inconsistently structured data such as social network feeds, Web logs, email, documents, and other text-centric information. These data types can be used for applications such as customer sentiment analysis, but they cannot be effectively managed in a relational database such as SQL Server, Oracle Database, or IBM's DB2."
Big data may seem the realm of big business, and to a great extent it is, but its impact is as meaningful as its ambitious moniker suggests. For example, the 2012 political campaigns are starting to use complex data tools (social network data mining, master data management) to gird for the election; CNN recently detailed the intricacies behind the Obama campaign's complex big data efforts for voter outreach.
CNN writes: "The Obama campaign not only has a Facebook page with 23 million 'likes' (roughly 10 times the total of all the Republicans running), it has a Facebook app that is scooping up all kinds of juicy facts about his supporters."
The report talks about the team's use of NationalField, which it describes as a social network tool that lets everyone share their work--not just the typical structured data, but "qualitative information: what points or themes worked for them in a one-on-one conversation with voters, for example. 'Ups,' 'Downs' and 'Solutions' are color-coded, so people can see where successes are happening or challenges brewing."
At this week's Web 2.0 Expo, LinkedIn's former Chief Scientist, DJ Patil, talked about the company's goal of making big data useful, the role of the "data scientist" and the act of what Patil called "data jujitsu," according to InformationWeek's David Carr. And where science meets martial arts, customers get things like LinkedIn's "people you may know" widget, recruiter recommendations, and job opening recommendations.
While data makes for better products and helps companies gain an edge, it can also save lives. InformationWeek Healthcare's Paul Cerrato wrote earlier this week about how doctors are starting to personalize care based on analyzing clinical data. For instance, medical professionals have begun using this data to individualize treatment protocols and to understand the risk factors that lead to patients being re-admitted to hospitals--and to intervene with specialized care to prevent it. They're also working to increase the success rate of drug treatments for diabetes and coronary patients while reducing treatment costs.
The Department of Homeland Security is even using big data analysis techniques to help prevent crime. As InformationWeek Government's Elizabeth Montalbano writes: "Specifically, the program--which is only in the preliminary stages of research--is using sensors to 'non-intrusively' collect video images, audio recordings, and so-called 'psychophysiological measurements' such as heart rate, breathing patterns, and eye blinking, that will be analyzed for their association with certain behaviors."
Big data isn't just about super-sizing an Oracle or Microsoft SQL Server database. It's not just "big" data; it's "better" data.
Let the race to get there begin.
Fritz Nelson is the editorial director for InformationWeek and the Executive Producer of TechWebTV. Fritz writes about startups and established companies alike, and likes to work multiple forms of media into his writing.