Hamstrung By Defective Data

Business information that's redundant, outdated, or flat-out wrong trips up organizations large and small--but there are fixes in the offing.

Rick Whiting, Contributor

May 6, 2006

Duplication Dangers
Bank of America has long collected account data in a centralized data warehouse for a variety of marketing and cross-selling applications. The bank's data quality efforts began in earnest in 2002 to comply with the anti-money-laundering provisions in the USA Patriot Act. Data on new accounts is collected in the multiterabyte warehouse from several lines of business, so Bank of America established common practices for capturing, integrating, and managing it, says Donald Carlson, who heads up the bank's anti-money-laundering program and has become its de facto data quality manager.

The bank designated data stewards in its business units and IT department, along with some who have companywide responsibility. Data quality managers meet monthly to resolve problems. Bank of America uses commercial and custom-built data profiling and matching tools to examine and, when necessary, correct data sent to the warehouse. Today, in addition to regulatory compliance, the bank's data quality efforts are driven by its risk management practices, the need to manage customer data from multiple channels, and cross-selling efforts.

Chart: Database debacles -- Has your company suffered losses, problems, or cost because of poor-quality data?

Integrating data from multiple business operations has also been a challenge at Cintas, which created new divisions as it expanded beyond its core employee uniform business into areas such as providing businesses with cleaning supplies and document storage and shredding services. That has resulted in customer data silos throughout the company, database marketing manager Becki Wessel says.

To help with cross-selling, data from all divisions is collected in a data warehouse, but the information is sometimes duplicated with slight variations. Some customers are listed in multiple databases but with enough variation in name or address to be identified as different people. Those discrepancies have sometimes led to existing customers being identified as new prospects--an embarrassing situation when a sales rep shows up. An added danger is that sales reps could begin to distrust leads provided by marketing, Wessel says. Or the names of two different customers could be close enough in spelling to be tagged as the same customer, costing the company a sales opportunity.
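The two failure modes described above, missed duplicates and false matches, can be illustrated with a small sketch. This is not how Cintas or any vendor's product works; it is a minimal illustration using Python's standard difflib module, and the company names, the helper functions, and the 0.85 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in text.lower()
    )
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score for two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_probable_duplicate(a: str, b: str, threshold: float = 0.85) -> bool:
    """Flag two records as the same customer when the score meets the threshold.

    The threshold is an illustrative assumption, not a recommended value.
    """
    return similarity(a, b) >= threshold


if __name__ == "__main__":
    # Hypothetical records: an exact string comparison treats these as
    # different customers, but they match once normalized.
    print(is_probable_duplicate("Acme Supply Co.", "ACME Supply Co"))

    # A real duplicate that falls just under the example threshold,
    # so it would be treated as a new prospect.
    print(is_probable_duplicate("Acme Supply Company", "Acme Supply Co"))

    # Two different (hypothetical) companies that score above the
    # threshold, so they would be wrongly merged.
    print(is_probable_duplicate("Johnson Tool", "Johnston Tool"))
```

In this sketch, no single threshold handles all three pairs correctly: raising it misses more true duplicates, and lowering it merges more distinct customers. That trade-off is one reason matching on the name alone is rarely enough and address or other fields are usually compared as well.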

As part of a project to overhaul its data warehouse, Cintas has been installing quality management software from Dataflux that will identify duplicate customer records and standardize customer data collected monthly from each division's database. The system is expected to be fully functional by next month, but a pilot project already has improved the company's ability to match customer names.

While Cintas is integrating customer data on a monthly batch basis, other companies do so in real or near-real time, which makes data quality even more difficult. More companies also are adding third-party data that may be erroneous or inconsistent. Bank of America's Carlson notes that the globalization of business--and data sources--further complicates the problem.
