The Big Data Era: How Data Strategy Will Change
From Barnes & Noble to Cabela's to Catalina Marketing, big companies are shifting their approaches.
If you want to understand the challenges of the Big Data era, hang around Catalina Marketing, a global marketing firm that works with a who's who of consumer packaged goods companies and retailers.
Catalina's data warehousing environment shot past the petabyte mark seven years ago and today stands at 2.5 PB. Its single largest database contains three years' worth of purchase history for 195 million U.S. customer loyalty program members at supermarkets, pharmacies, and other retailers. With 600 billion rows of data in a single table, it's the largest loyalty database in the world, Catalina maintains.
At the cash registers of Catalina's retail customers, real-time analysis of that data triggers printouts of coupons that shoppers are handed with their receipt at checkout. Each coupon is unique--two shoppers checking out one after the other, with identical items in their carts, will get different coupons based on their buying histories, combined with third-party demographic data.
Few companies operate at Catalina's scale, but almost every company is living in its own version of the Big Data era. Two forces define this era: size and speed. And those forces are driving companies to consider new choices for how they deal with data.
Size is relative--by some estimates, 90% of data warehouses hold less than 5 TB. But it's the pace of growth that has companies rethinking their options. Nearly half (46%) of organizations surveyed last year by the Data Warehousing Institute said they'll replace their primary data warehousing platform by 2012.
Speed is sometimes about pure performance, as in how quickly a system answers a query, but more important is the broad notion of "speed to insight." That's about how much time people--often statistician-analyst-type people--must spend loading data and tuning for performance. The pressure is on IT to get insights out of ever-larger data sets--faster.
This Big Data era got rolling way back in the dot-com days. Since then, a number of alternatives have emerged to challenge the conventional relational databases from Oracle, IBM, and Microsoft. Those options fall into two camps: systems supporting massively parallel processing (MPP) and systems built on column-store databases.