Few companies on the planet handle bigger big data than Catalina Marketing. And if leading one major technology trend weren't enough, Catalina is spearheading another: delivering real real-time insights--as in, less than a second after a retail transaction. Not bad for a company with just 1,100 employees and 250 IT workers.
To give you some idea of Catalina's data scale, its primary database holds more than 2.5 petabytes of information and adds data on more than 300 million retail transactions per week. When you check out with a loyalty card at any one of 50,000 grocery, drug, or mass-merchandise retail stores in the U.S., Europe, and Japan (and in many stores, more than 90% of customers use loyalty cards), insight derived from Catalina's database triggers promotions and offers based on your past purchases. The coupons stream out of Catalina's point-of-sale printers at every checkout lane and are handed to customers along with their receipts within seconds of the transactions.
Catalina's customers include manufacturers such as Coca-Cola, Kellogg's, Kraft Foods, and Procter & Gamble; and retailers such as Kmart, Kroger, Ralphs, Safeway, Stop & Shop, Target, and Winn-Dixie. It has been in the coupon and promotions business since 1983, and it has constantly pushed the limits of analytics, data processing, databases, networking, and printing. By 2000, Catalina was managing big data in a custom computer grid built on commodity hardware. In 2003, it became one of the first companies to try the new breed of data warehouse appliances for massively parallel processing, as an anchor customer of Netezza, now owned by IBM.
Netezza is still the company's platform, but Catalina makes a habit of keeping vendors on their toes, regularly reviewing rival products and seeking help on big technical challenges. In 2007, Catalina worked with Netezza and SAS Institute to move scoring of purchase-behavior models into the Netezza database for faster processing. In-database processing was only an emerging trend at the time, and that two-year initiative led to the productization of the SAS Scoring Accelerator for Netezza. It also helped put the in-database approach on the map as the way to handle large-scale analytics. Data warehousing platform vendors now compete on the depth and breadth of their in-database processing capabilities.
For Catalina, in-database was the only way to solve a big productivity challenge. Catalina's database is in the same league as the enterprise data warehouses at Walmart and Bank of America; Catalina calls it "the world's largest transaction-level, shopper-data warehouse." It's used to study what consumers buy; the pattern of items they buy together; and variations by geography, market area, chain, store, and ZIP code. Most important for Catalina and its customers, it predicts and reveals the power of promotions, delivered through coupons, to change purchasing behavior.
The prediction part is achieved by modeling against the historical data. By doing that work inside the database, Catalina didn't have to move masses of data into slow-moving analytic servers to score models and reveal their effectiveness in predicting behavior. Models that previously took half a day to process can be scored within 60 seconds. The performance gains, realized in early 2010, let Catalina develop more than 600 models last year with just five data modelers--one fewer modeler than it previously employed to deliver 40 to 50 new models per year. That productivity translates into many more analyses and more accurate and successful coupon offers, the company says.
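To put those figures in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: the article does not say exactly how long "half a day" is, so 12 hours is assumed here, and the function names are purely illustrative.

```python
# Rough arithmetic on the figures quoted above.
# Assumption: "half a day" is taken as 12 hours; the article gives no exact figure.
HALF_DAY_SECONDS = 12 * 60 * 60   # 43,200 seconds
IN_DATABASE_SECONDS = 60          # "within 60 seconds"


def speedup(before_seconds, after_seconds):
    """How many times faster the new scoring time is than the old one."""
    return before_seconds / after_seconds


def models_per_modeler(models, modelers):
    """Average number of models delivered per data modeler per year."""
    return models / modelers


scoring_speedup = speedup(HALF_DAY_SECONDS, IN_DATABASE_SECONDS)

# After: more than 600 models with five modelers.
after_rate = models_per_modeler(600, 5)

# Before: 40 to 50 models per year with six modelers (one more than today).
before_low = models_per_modeler(40, 6)
before_high = models_per_modeler(50, 6)

# Productivity gain per modeler, as a conservative-to-optimistic range.
gain_low = after_rate / before_high
gain_high = after_rate / before_low

print(f"Scoring speedup: about {scoring_speedup:.0f}x")
print(f"Models per modeler: {before_low:.1f}-{before_high:.1f} before, {after_rate:.0f} after")
print(f"Per-modeler productivity gain: about {gain_low:.1f}x to {gain_high:.1f}x")
```

Under that 12-hour assumption, scoring is on the order of several hundred times faster, and each modeler delivers well over ten times as many models per year as before.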
Without data-driven analyses, redemption rates on coupons are around 1%. With basic targeting, like giving buyers of diet soda or dog food coupons for alternative brands, redemption rates rise to 6% to 10%. Using historical purchase-behavior data and the sophisticated predictive models Catalina employs, redemption rates are as high as 25%.
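As a quick illustration of what those rates mean in practice, the sketch below converts them into expected redemptions for a batch of coupons. The batch size of 10,000 is a made-up number chosen for easy arithmetic, and the 25% figure is treated as an upper bound, since the article says rates are "as high as" 25%.

```python
# Redemption rates quoted above, as whole percentages (low, high).
REDEMPTION_RATES_PCT = {
    "no data-driven analysis": (1, 1),    # "around 1%"
    "basic targeting": (6, 10),           # "6% to 10%"
    "predictive models": (25, 25),        # "as high as 25%": an upper bound
}

BASELINE_PCT = 1          # untargeted coupons
COUPONS_PRINTED = 10_000  # hypothetical batch size, not from the article


def expected_redemptions(coupons, rate_pct):
    """Expected number of coupons redeemed at a given percentage rate."""
    return coupons * rate_pct / 100


def lift_over_baseline(rate_pct, baseline_pct=BASELINE_PCT):
    """How many times higher a redemption rate is than the untargeted baseline."""
    return rate_pct / baseline_pct


for approach, (low, high) in REDEMPTION_RATES_PCT.items():
    redeemed_low = expected_redemptions(COUPONS_PRINTED, low)
    redeemed_high = expected_redemptions(COUPONS_PRINTED, high)
    print(
        f"{approach}: {redeemed_low:.0f} to {redeemed_high:.0f} redemptions "
        f"per {COUPONS_PRINTED:,} coupons "
        f"({lift_over_baseline(low):.0f}x to {lift_over_baseline(high):.0f}x baseline)"
    )
```

On these numbers, the best-performing predictive offers redeem at up to 25 times the untargeted rate.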