Catalina Marketing Aims For The Cutting Edge Of 'Big Data'
No. 5 company in our 2011 InformationWeek 500 ranking is fast, too, delivering insights within a second of a transaction.
Few companies on the planet handle bigger big data than Catalina Marketing. And if leading one major technology trend weren't enough, Catalina is spearheading another: delivering real real-time insights--as in, within less than a second of a retail transaction. Not bad for a company with just 1,100 employees and 250 IT workers.
To give you some idea of Catalina's data scale, its primary database holds more than 2.5 petabytes of information and adds data on more than 300 million retail transactions per week. When you check out with a loyalty card at any one of 50,000 grocery, drug, or mass-merchandise retail stores in the U.S., Europe, and Japan (and in many stores, more than 90% of customers use loyalty cards), insight derived from Catalina's database triggers promotions and offers based on your past purchases. The coupons stream out of Catalina's point-of-sale printers at every checkout lane and are handed to customers along with their receipts within seconds of the transactions.
Catalina's customers include manufacturers such as Coca-Cola, Kellogg's, Kraft Foods, and Procter & Gamble; and retailers such as Kmart, Kroger, Ralphs, Safeway, Stop & Shop, Target, and Winn-Dixie. It has been in the coupon and promotions business since 1983, and it has constantly pushed the limits of analytics, data processing, databases, networking, and printing. By 2000, Catalina was managing big data in a custom computer grid built on commodity hardware. In 2003, it became one of the first companies to try the new breed of data warehouse appliances for massively parallel processing, as an anchor customer of Netezza, now owned by IBM.
Netezza is still the company's platform, but Catalina makes a habit of keeping vendors on their toes, regularly reviewing rival products and seeking help on big technical challenges. In 2007, Catalina worked with Netezza and SAS Institute to move scoring of purchase-behavior models into the Netezza database for faster processing. In-database processing was only an emerging trend at the time, but that two-year initiative led to the productization of the SAS Scoring Accelerator for Netezza. It also helped put the in-database approach on the map as the way to handle large-scale analytics. Data warehousing platform vendors now compete on the depth and breadth of their in-database processing capabilities.
For Catalina, in-database was the only way to solve a big productivity challenge. Catalina's database is in the same league as the enterprise data warehouses at Walmart and Bank of America; Catalina calls it "the world's largest transaction-level, shopper-data warehouse." It's used to study what consumers buy; the pattern of items they buy together; and variations by geography, market area, chain, store, and ZIP code. Most important for Catalina and its customers, it predicts and reveals the power of promotions, delivered through coupons, to change purchasing behavior.
Top 10 IT Innovators: InformationWeek 500
The prediction part is achieved by modeling against the historical data. By doing that work inside the database, Catalina didn't have to move masses of data into slow-moving analytic servers to score models and reveal their effectiveness in predicting behavior. Models that previously took half a day to process can be scored within 60 seconds. The performance gains, realized in early 2010, let Catalina develop more than 600 models last year with just five data modelers--one fewer modeler than it previously employed to deliver 40 to 50 new models per year. That productivity translates into many more analyses and more accurate and successful coupon offers, the company says.
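The scale of those gains is easier to see as ratios. The following is a rough back-of-the-envelope sketch, not Catalina's own calculation. It assumes "half a day" means 12 hours, uses the midpoint of the 40-to-50 range for the earlier period, and treats 600 as a floor since the article says "more than 600." The function names are illustrative.

```python
# Back-of-the-envelope check of the productivity figures quoted above.
# Assumption: "half a day" is taken as 12 hours; the article gives no exact figure.
HALF_DAY_SECONDS = 12 * 60 * 60
SCORING_SECONDS_AFTER = 60


def scoring_speedup(before_s=HALF_DAY_SECONDS, after_s=SCORING_SECONDS_AFTER):
    """How many times faster a model is scored after the in-database move."""
    return before_s / after_s


def models_per_modeler(models, modelers):
    """Average number of models delivered per data modeler per year."""
    return models / modelers


# Assumption: 45 is the midpoint of "40 to 50"; six modelers is five plus one.
before = models_per_modeler(45, 6)
# "More than 600" models, so this is a lower bound.
after = models_per_modeler(600, 5)

print(scoring_speedup())   # 720.0
print(before, after)       # 7.5 120.0
print(after / before)      # 16.0
```

Under those assumptions, scoring is roughly 720 times faster and each modeler delivers at least 16 times as many models per year.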
Without data-driven analyses, redemption rates on coupons are around 1%. With basic targeting, like giving buyers of diet soda or dog food coupons for alternative brands, redemption rates rise to 6% to 10%. Using historical purchase-behavior data and the sophisticated predictive models Catalina employs, redemption rates are as high as 25%.
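To put those percentages in concrete terms, here is a small sketch that converts the quoted redemption rates into expected redemptions for a given print run. The one-million-coupon run is a hypothetical figure chosen only for illustration, and the 25% rate is an upper bound ("as high as").

```python
# Redemption rates quoted above, as whole percentages (low, high).
REDEMPTION_PCT = {
    "no targeting": (1, 1),          # "around 1%"
    "basic targeting": (6, 10),      # "6% to 10%"
    "predictive models": (25, 25),   # "as high as 25%", so an upper bound
}


def expected_redemptions(coupons_printed, pct):
    """Coupons redeemed at a given whole-number percentage rate."""
    return coupons_printed * pct // 100


def redemption_table(coupons_printed):
    """Map each approach to its (low, high) expected redemption counts."""
    return {
        approach: (
            expected_redemptions(coupons_printed, low),
            expected_redemptions(coupons_printed, high),
        )
        for approach, (low, high) in REDEMPTION_PCT.items()
    }


# Hypothetical print run of one million coupons.
for approach, (low, high) in redemption_table(1_000_000).items():
    print(f"{approach}: {low:,} to {high:,} redemptions")
```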
Going Real Time
As the in-database project proved, Catalina isn't afraid of pioneering, multiyear projects--as long as a big payoff is in sight. In the company's latest project, it tried to find a technological way around two crucial limitations. First, for bandwidth and local compute-capacity reasons, Catalina's promotions tied to specific store loyalty card holders could only be loaded onto a PC-like Catalina server at customers' "home stores," where they shop most frequently. Trouble is, shoppers often frequent multiple locations of the same chain. Second, Catalina has been limited to pulling data on new retail transactions each night, and the turnaround of new promotional offers downloaded to the in-store servers was two days.
Catalina coupons and promotions reach about 90 million households each year, but between the home-store and two-day latency limitations, the "reach" of the system maxed out at delivering new offers to about 75% of those customers within a four-week period. Three years ago, Catalina finally decided the time was right to change those dynamics. "We had been investigating a new approach for at least 10 years, but the cost of putting high-speed lines and high-performance servers into every store was a barrier," says CIO Eric Williams.
With Moore's Law and bandwidth improvements bringing the pieces into place, the company came up with a centralized-server approach dubbed Catalina Real-Time, or CRT. The CRT server incorporates about $100,000 worth of blade servers, an Oracle database, and networking equipment in a standard 19-inch rack. Deployed at the headquarters of top retail chains, this centralized server then links with the Catalina servers at each store location.
Eric Williams, CIO, Catalina Marketing
CRT captures transactions and delivers coupons in real time no matter which store a customer is shopping in. This makes it possible for Catalina to support multitrip "threshold" promotions that were never before possible. For example, a retail chain or manufacturer might offer $10 off your next shopping trip if you buy 10 products from a specific manufacturer within three months. CRT lets Catalina deliver customer-specific offers and up-to-the-minute customer status information to all retail locations. And customers can earn an incentive instantaneously, no matter which store they're in, as soon as they meet a purchase threshold.
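The logic of a threshold promotion like the one described above can be sketched in a few lines. This is an illustrative model only, not Catalina's implementation: the class names are hypothetical, an in-memory dictionary stands in for the chain-level database, "three months" is approximated as 90 days, and each card earns the reward once.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ThresholdPromotion:
    manufacturer: str
    required_items: int = 10
    window: timedelta = timedelta(days=90)  # assumption: "three months" ~ 90 days
    reward_dollars: int = 10


class CentralPromotionTracker:
    """Hypothetical chain-level state shared by every store location."""

    def __init__(self, promo):
        self.promo = promo
        self._purchases = defaultdict(list)  # card_id -> one date per qualifying item
        self._rewarded = set()               # cards that already earned the reward

    def record_transaction(self, card_id, store_id, when, items):
        """Record (manufacturer, quantity) items; return reward dollars earned now.

        store_id is accepted but never used in the decision: because state is
        held centrally, the result is the same whichever store reports the sale.
        """
        for manufacturer, quantity in items:
            if manufacturer == self.promo.manufacturer:
                self._purchases[card_id].extend([when] * quantity)

        # Keep only purchases that still fall inside the promotion window.
        cutoff = when - self.promo.window
        recent = [d for d in self._purchases[card_id] if d >= cutoff]
        self._purchases[card_id] = recent

        if card_id not in self._rewarded and len(recent) >= self.promo.required_items:
            self._rewarded.add(card_id)
            return self.promo.reward_dollars
        return 0


tracker = CentralPromotionTracker(ThresholdPromotion("Acme Foods"))
first = tracker.record_transaction(
    "card-1", "store-A", date(2011, 6, 1), [("Acme Foods", 6)]
)
second = tracker.record_transaction(
    "card-1", "store-B", date(2011, 7, 15), [("Acme Foods", 4), ("Other Brand", 2)]
)
print(first, second)  # 0 10
```

The second trip happens at a different store, yet the shopper crosses the threshold and earns the reward at that checkout, which is the behavior the home-store model could not support.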
Overcoming Obstacles
At one point, Catalina's CRT team had doubts that the project would fly because they were once again pushing the limits of available technology. To speed development, Catalina was trying to adapt a credit-authorization server product normally used by banks and oil companies with many locations. Catalina didn't need to do credit authorizations, but it wanted to take advantage of the store-to-headquarters transaction-posting functionality. That software normally runs on big iron servers, but Catalina wanted to run higher-than-normal transaction volumes on a cheaper class of commodity blade servers. Early performance specs fell far short of expectations.
Initial tests were running at fewer than 100 transactions per second, where they needed to exceed 300. "We knew we didn't have a business model unless we could get it to run faster on a Linux-based platform using low-cost blade servers," Williams says.
Catalina's developers cracked open the software and stripped out every scrap of unnecessary code tied to credit authorizations. That worked, and the team eventually pushed system performance up to about 450 transactions per second. That level of performance is what it takes to support retailers with thousands of locations and dozens of checkout lanes per store.
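A rough capacity calculation shows why those throughput numbers matter. The sketch below uses the three figures from the text (under 100, a 300 target, about 450 achieved). The store count, lanes per store, and seconds per checkout are purely hypothetical inputs chosen to illustrate "thousands of locations and dozens of checkout lanes," not figures from Catalina.

```python
# Throughput figures from the text, in transactions per second (TPS).
INITIAL_TPS = 100    # "fewer than 100", so an upper bound on early tests
TARGET_TPS = 300     # the rate the system needed to exceed
ACHIEVED_TPS = 450   # "about 450" after stripping unneeded code


def required_tps(stores, lanes_per_store, seconds_per_transaction):
    """Peak load if every lane completes one transaction per interval."""
    return stores * lanes_per_store / seconds_per_transaction


def can_support(capacity_tps, stores, lanes_per_store, seconds_per_transaction):
    """True if the server capacity covers the estimated peak load."""
    return capacity_tps >= required_tps(stores, lanes_per_store, seconds_per_transaction)


# Hypothetical chain: 2,000 stores, 24 lanes each, one checkout per lane
# every 120 seconds at peak. These inputs are assumptions for illustration.
peak = required_tps(2000, 24, 120)
print(peak)                                      # 400.0
print(can_support(INITIAL_TPS, 2000, 24, 120))   # False
print(can_support(ACHIEVED_TPS, 2000, 24, 120))  # True
print(ACHIEVED_TPS / TARGET_TPS)                 # 1.5
```

Under these assumed inputs, the early build would have been swamped at peak, while the tuned system clears the load with room to spare and sits at 1.5 times the original target.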
By the end of this year, Catalina expects to deploy CRT across 22 of its top retail customer chains, accounting for about 85% of Catalina's retail transactions.
Driving The Business
Big data still needs to be fast
Technology is obviously hugely important to Catalina, and the employee numbers bear that out: Nearly half of the 500 people at headquarters work in IT. "We are a technology company, and IT is responsible for the innovation of the business," says Williams, who looks to hire experienced, versatile team players. "I'm not looking for people who have worked in large development shops and are used to working in silos of development," he says. "Our teams are cross-functional, and they have to handle whatever needs to be done."
Catalina typically assigns what Williams describes as "agile" teams of 12 to 15 people to projects such as the in-database analytics and CRT ones. Goals change fast. "I'm amazed at how many people in technology like to have clearly defined requirements," he says. "We're a sales company, so I guarantee you the targets are going to move, and you need to plan for them to move."
Catalina IT is akin to a product development team, as Williams describes it, working with brand development people to explore what's possible. "My team just loves that," he says, "because they can take on these projects and then go see them really work out there as a consumer."