IBM And Big Data Disruption: Insider's View

IBM's Bob Picciano, general manager of Information Management, talks up five big data use cases, Hadoop-driven change; slams SAP Hana, NoSQL databases.

Disrupting Legacy

IW: You do acknowledge that Hadoop is emerging, but is IBM committed to bringing that platform to enterprises even if it might displace legacy data warehouse workloads?

Picciano: Hadoop will not just displace some aspects of data warehouse work; it will create disruption in the field of ETL as well.

IW: And also mainframe processing. So is IBM really going to champion Hadoop if it might displace data warehousing, ETL and mainframe workloads?

Picciano: Yes, although I would be careful to define the legacy businesses. One of the biggest businesses around the Z mainframe is around Linux and workload consolidation. As we run Hadoop on Linux, there's an opportunity to have that workload in a Z environment. In fact, we've announced the ability to put our BigInsights engine on zBX, which are Z blades inside of a zEnterprise cluster.

IW: What's the advantage of that approach? Isn't one of the most notable benefits of Hadoop taking advantage of low-cost commodity hardware?

Picciano: It's about handling a diversity of workloads in one environment. If you consider that Z is the system of record in most institutions, why wouldn't they also want to be able to get faster, real-time analytic views into that information? Right now companies have to move that data, on average, 16 times to get it inside a tier where they can do analysis work. We're giving them an option to shorten the synapse between transaction and insight with our IBM DB2 Analytics Accelerator (IDAA).

It makes perfect sense to do that with a data warehouse, and we're having great success where organizations are looking at their Teradata environments in comparison to the efficiency of putting an IDAA on Z. They're saying, why am I sending that data all the way over there to that expensive Teradata system? When you send queries to DB2 when the IDAA is attached, it figures out whether it's more effective to run the query with Z MIPS or whether to run it on the IDAA box.

IW: So you're talking about running Hadoop on mainframe, but is that evidence that IBM is willing to disrupt existing business and be an agent of change?

Picciano: If you look at our company's history, especially in the information management space, we started with hierarchical databases but we were the agent of our own change by introducing relational systems. We introduced XML-based systems and object-relational systems. Some of them had more traction than others and some of them fizzled out and never really produced much.

We think there's real value for our clients around Hadoop and data in motion. In some ways that disrupts the data warehousing market in a new way in that you're analyzing in real-time, not in a warehouse. That's very threatening to storage players because you're intelligently determining what patterns are interesting in real time as opposed to just trying to build a bigger repository. We're doing this not because we think it's intellectually stimulating but because it's valuable to customers.

IW: Is there a poster-child customer where mainframe or ETL or DB2 workloads have dramatically changed because IBM is helping them reengineer?

Picciano: General Motors is an example where CIO Randy Mott is transforming and bringing IT back into the company. He's doing that utilizing a Teradata enterprise data warehouse and a new generation of extract-load-transform capabilities using Hadoop as the transformation engine. IBM BigInsights is the Hadoop engine and we're taking our DataStage [data transformation] patterns into Hadoop.

IW: Upstarts are making claims about how big data is changing enterprise architectures. It makes you wonder who's driving the trends.

Picciano: I think IBM is driving, and the reason is this architecture that I've talked about where you have different analytical zones that are really effective at certain aspects of the big data problem. You can't look at it through a purist lens and say, "Hadoop will be able to do all these things," because it just cannot do all those things.

IW: There's clearly a role for multiple technologies. The question is how technology investments will change, and how quickly.

Picciano: Customer value has to be in the center of everyone's cross hairs. It's not a technology experiment or a science project. The use cases that I talked about are where customers are getting additive value because they're analyzing operational data that they couldn't analyze before. They're getting a different view of their clients that wouldn't have been economical to build and so on... When you look at what's required in each of those zones, IBM has a leadership stake in all of those areas and we're putting vigorous investment even into areas that may appear to be most disruptive, like Hadoop.
