4/7/2014 03:00 PM

Teradata QueryGrid: Beyond Enterprise Data Warehouse

Teradata adds unified fabric access to myriad databases and Hadoop, triggering multiple analysis engines with a single query. Say goodbye to yesterday's enterprise data warehouse ideas.

Teradata, the enterprise data warehouse (EDW) company, announced a QueryGrid data-access layer on Monday that can orchestrate multiple modes of analysis across multiple databases plus Hadoop. It's a next step toward Gartner's vision of a logical data warehouse and an acknowledgement that the notion of the EDW has fundamentally changed.

Teradata has already acknowledged the world beyond the EDW with its Unified Data Architecture, which incorporates the Teradata Aster database for data discovery and Hadoop for varied and voluminous data not well suited to relational database management systems (DBMS). The QueryGrid adds a single execution layer that orchestrates analyses across Teradata, Aster, Oracle, Hadoop, and, in the future, other databases and platforms. The analysis options include SQL queries as well as graph analysis, MapReduce, R-based analytics, and other applications.

"Users don't care if information is sitting inside of a data warehouse or Hadoop, and enterprises don't want a lot of data movement or data duplication," said Chris Twogood, Teradata's vice president of product and services marketing. "The QueryGrid gives them a transparent way to optimize the power of different technologies within a logical data warehouse."

[Want more on the emergence of new analysis techniques? Read Merck Optimizes Manufacturing With Big Data Analytics.]

Offering two-way InfiniBand connectivity among data sources, the QueryGrid can execute sophisticated, multi-part analyses. After finding a segment of high-value customers in Teradata, for example, you could push that subset into Hadoop to explore their sentiments as revealed in Twitter and Facebook social comments. Spotting customers likely to churn -- based on negative sentiments -- you could bring that subset into Aster, where graph analysis could be used to spot the most influential customers. Voila: you have a list of high-value, well-connected customers who should be included in an anti-churn campaign.
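
To make that flow concrete, here is a minimal Python sketch of the same three steps stitched together by hand with pyodbc. The DSNs (tdprod, hadoop, aster) and the table and column names are assumptions for illustration only; QueryGrid's pitch is that this kind of multi-system analysis can be pushed down from a single query rather than wired up in client code like this.

import pyodbc  # generic ODBC access; the DSNs below are hypothetical

# Step 1: find high-value customers in the Teradata warehouse.
warehouse = pyodbc.connect("DSN=tdprod")
high_value = [r.customer_id for r in warehouse.cursor().execute(
    "SELECT customer_id FROM customers WHERE lifetime_value > 10000")]

# Step 2: in Hadoop (via Hive), keep only those with negative social sentiment.
hive = pyodbc.connect("DSN=hadoop")
marks = ", ".join("?" * len(high_value))
at_risk = [r.customer_id for r in hive.cursor().execute(
    "SELECT customer_id FROM social_comments "
    "WHERE sentiment_score < 0 AND customer_id IN (%s)" % marks,
    high_value)]

# Step 3: in Aster, rank the at-risk group by graph-derived influence and
# hand the result to the anti-churn campaign.
aster = pyodbc.connect("DSN=aster")
marks = ", ".join("?" * len(at_risk))
campaign = aster.cursor().execute(
    "SELECT customer_id, influence_score FROM customer_influence "
    "WHERE customer_id IN (%s) ORDER BY influence_score DESC" % marks,
    at_risk).fetchall()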

"There are so many specialized engines, so we want to be able to leverage and integrate those while enabling users of the data warehouse to be able to invoke those techniques," Twogood said.

Teradata isn't the only vendor building what Gartner calls the logical data warehouse. Just last month, SAP introduced its Hana In-Memory Data Fabric for federated data access across sources. And since 2009, IBM has offered its DB2 Information Integrator for federated access to multiple data sources. But where these tools are SQL-centric, Teradata's differentiator is broader access to a variety of analysis engines.

With an eye toward heterogeneity, Teradata also introduced Teradata 15 on Monday. This DBMS update adds support for JSON data and goes beyond SQL to invoke applications written in Python, Perl, Ruby, and R, with other languages to come.

"This gives you the architectural flexibility to separate the presentation layer and the data-analytics layer," said Alan Greenspan, a product marketing manager at Teradata. "Instead of forcing developers to turn to the data warehouse group to do everything in SQL, they can write their own code and execute it in parallel within the database." The approach avoids data movement, data processing on application servers, and other workarounds between web developers and data-management teams. Greenspan said.

Teradata also announced on Monday an upgrade of its flagship hardware platform. The upgrade offers eight times more memory and three times more solid-state drives per rack than the 6700 series introduced 18 months ago. With 512 gigabytes of memory now available per compute node, Teradata's Intelligent Memory feature can hold more high-demand data in memory for lightning-fast analysis at RAM speeds.

Introduced in 2013, Teradata Intelligent Memory automatically moves high-demand data to the fastest storage choices available while moving low-demand data to the lowest-cost storage options like high-capacity disk drives. It was a response to SAP's Hana in-memory platform and a preemptive move ahead of the Microsoft and Oracle in-memory options being introduced this year.
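
The temperature-based placement idea can be sketched with a toy example (a conceptual illustration, not Teradata's actual algorithm): count how often each object is referenced and keep only the hottest ones in a limited in-memory tier, leaving the rest on cheaper storage.

from collections import Counter
import heapq

MEMORY_SLOTS = 2           # assumed capacity of the fast in-memory tier
access_counts = Counter()  # how "hot" each object is, by reference count

def record_access(name: str) -> None:
    access_counts[name] += 1

def in_memory_tier() -> set:
    """Objects currently hot enough to hold in memory; the rest stay on disk."""
    return set(heapq.nlargest(MEMORY_SLOTS, access_counts, key=access_counts.get))

for table in ["orders", "orders", "orders", "clicks", "clicks", "archive_2004"]:
    record_access(table)

print(in_memory_tier())  # {'orders', 'clicks'}: high-demand tables; the archive stays cold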

The biggest news here is clearly the QueryGrid. After years of preaching that everything should go in the enterprise data warehouse, Teradata is acknowledging and embracing a world in which the EDW doesn't have to be the center of analysis.


Doug Henschen is Executive Editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data, and analytics. He previously served as editor in chief of Intelligent Enterprise.

Comments

masifabbasi, 5/24/2014 | 4:13:11 PM
Re: Propagating a myth about Hadoop
Very nice article... Thanks for expanding on QueryGrid. Companies like Cisco's Composite have built some top-class optimizers that do similar stuff, but have the edge of eight years of R&D on federated queries. Top-class rule-based optimizers, intelligent data shipping, and an overlay of a cost-based optimizer make it really useful for today's myriad of data sets. Just wanted to respond to some of your comments above:

"DBMS vendors like to say that Hadoop is all about "unstructured" or "semi-structured" information."


I don't think that is what they are saying. Companies like Teradata (IMHO) are saying that Hadoop, acting as a data lake, is a good place to store structured, unstructured, and semi-structured data that is cold and doesn't require near-real-time access. Hadoop vendors like Hortonworks also agree with that statement.

"Check out my recent profile of Merck & Co., which combined 16 different types of very structured data to figure out why some batches of a vaccine had high yield rates and other batches had low yield rates. The advantage was being able to "dump everything in a lake" without time-consuming data modeling and ETL work."

That is exactly what these companies are proposing. Modeling and ETL aren't required for all use cases in the data management world. We have done POCs where we worked with tens to hundreds of terabytes of data using a combination of Hadoop and Aster, and achieved results within a couple of weeks. Essentially these companies are saying that you need a cheap platform to store the data (like Hadoop), but when you want to do number crunching and complex analytics, you need powerful and intelligent SQL, MapReduce, and graph engines (unless you are Facebook, Google, or LinkedIn, with access to top-tier talent), and the skillset to achieve the result set.



Having said all that, the majority of the problems out there could be solved using a multitude of technologies. The decision to use a particular technology should depend on the following:

1. Urgency of problem (SLA)

2. Your skillset/strengths

3. Skillset available in the market

4. ROI

Thanks for sharing the article BTW :)

D. Henschen (Author), 4/7/2014 | 5:50:10 PM
Propagating a myth about Hadoop
RDBMS vendors like to say that Hadoop is all about "unstructured" or "semi-structured" information. And their favorite example is Twitter and Facebook comments. That's not the full truth, and it's a bit of a left-handed compliment meant to diminish the value of Hadoop. The fact is, Hadoop can handle very structured information that happens to be varied, voluminous, inconsistent (a.k.a. sparse), or all of the above.

Check out my recent profile of Merck & Co., which combined 16 different types of very structured data to figure out why some batches of a vaccine had high yield rates and other batches had low yield rates. The advantage was being able to "dump everything in a lake" without time-consuming data modeling and ETL work. Within three months Merck was able to cluster and visualize batch yield rates and spot "smoking guns" within 10 years' worth of product and manufacturing plant data. The data wasn't actually all that big -- only 1.5 terabytes -- but the ability to bring together a variety of data quickly made all the difference.