01:27 PM
Doug Henschen

5 Big Wishes For Big Data Deployments

Big data project leaders still hunger for some key technology ingredients. Starting with SQL analysis, we examine the top five wants and the people working to solve those problems.

Wish 4: Real-time Analysis Options
Another item on the big-data analytics wish list is real-time performance. Two startup vendors going after this opportunity are marketing analytics vendor Causata and real-time Hadoop-analysis vendor HStreaming.

For Causata, "real time" means making decisions in under 50 milliseconds. You need that kind of speed to change content, banner ads and marketing offers while your customers are still active on websites and mobile devices. Causata uses Hadoop's HBase NoSQL database for storage of marketing-related data that might include clickstreams, campaign-response data and CRM records. HBase isn't good at real-time querying, however, so Causata runs Java-based algorithms on a proprietary query engine to improve performance.

As its name hints, HStreaming relies on stream-processing technology that's similar to the event-processing engines used by financial trading operations and offered by IBM (InfoSphere Streams), Progress Software (Apama), SAP (Sybase Aleri), Tibco (Complex Event Processing) and others. HStreaming takes data directly from always-on sources such as video surveillance cameras, cell towers and sensors, and spots patterns in that data while it's still in flight. The technology also provides a form of extract, transform, load (ETL) that stores the data in Hadoop for later analysis. HStreaming cites video surveillance, network optimization and mobile advertising as its top applications. In all three cases, real-time insight and action are a must.
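To make the stream-processing idea concrete, here is a minimal Python sketch of the general pattern: inspect each reading as it arrives, flag anomalies against a short sliding window, and group records into batches for later storage. It is an illustration only and does not represent HStreaming's product or API. The function names, the spike rule (a reading above twice the recent average) and the batch size are all assumptions made for this example.

```python
from collections import deque
from typing import Iterable, Iterator, List, Tuple


def detect_spikes(
    readings: Iterable[float], window: int = 5, factor: float = 2.0
) -> Iterator[Tuple[int, float]]:
    """Yield (index, value) for readings above `factor` times the
    mean of the previous `window` readings (hypothetical rule)."""
    recent = deque(maxlen=window)  # holds only the last `window` values
    for i, value in enumerate(readings):
        # Only judge a reading once a full window of history exists.
        if len(recent) == window:
            baseline = sum(recent) / window
            if value > factor * baseline:
                yield i, value
        recent.append(value)


def batch_for_storage(records: Iterable, batch_size: int = 3) -> Iterator[List]:
    """Group records into fixed-size batches, standing in for the
    load step that writes stream data to a store for later analysis."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield batch


if __name__ == "__main__":
    sensor = [10.0, 10.0, 10.0, 10.0, 10.0, 50.0, 10.0, 10.0]
    print(list(detect_spikes(sensor)))
    print(list(batch_for_storage(sensor)))
```

Because both functions are generators, they consume one reading at a time and never need the whole data set in memory, which is the property that separates in-flight analysis from batch queries against stored data.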

Taking a different tack, Hadoop software and support vendor MapR has announced a partnership with Informatica through which it claims it will become the first and only Hadoop software distributor capable of delivering near-real-time data streaming on the big-data platform. MapR's Hadoop distribution features a lockless storage services layer that works hand-in-hand with Informatica messaging software to continuously stream massive amounts of data into Hadoop. Couple this capability with a coming SQL-on-Hadoop option such as MapR-favored Drill, and you'll have yet another option for fast big-data analysis.
