Getting Quality Data Now

We ask a group of vendors that specialize in solving companies' data quality problems to offer advice on how to avoid "bad data in, bad data out" syndrome.

Jennifer Bosavage, Editor In Chief, Solution Providers for Retail

September 15, 2005

5 Min Read

As anyone who has ever tried to generate any sort of report from a data repository knows, bad data in equals bad data out. The problem is, it's often unclear that data is "bad" or unusable -- until you need to use it.

Getting good, usable data out of your business intelligence system takes some planning. That's because ensuring data quality requires different areas of an organization to communicate and come to consensus on certain issues. But it'll be worth the effort: The Data Warehousing Institute estimates that low data quality costs companies $611 billion annually. And PricewaterhouseCoopers has stated in previous reports that low data quality leads to 75 percent of general budget overruns. There's no better time to start cleaning up your act than today -- but it won't be easy.

"Execution is complicated and complex and generally effects every part of organization," says Sam Barclay, vice president of business development at customer relationship management (CRM) firm StayInFront Inc. "Various departments have to agree on formats. The key thing about data quality is getting accurate information in the first place. The problem is that while information has become easier to collect in the last 25 years, that [situation] has not resulted in better data."

StayInFront provides CRM applications, decision support tools and e-business systems. The company combines its technologies with an extensive implementation and support infrastructure, and offers customers the option of using one or more of its tools individually or implementing a complete solution.

If you are wondering how to start cleaning up your process, be sure you have a clear understanding of your business goals. First off, experts suggest, determine what you are going to need to know from your data. That way, you'll be able to input data that is accurate and complete. Obviously, different departments have different needs. It's not an easy task to build a consensus among diverse users, but if all those concerns are addressed before a single piece of data is input, you'll save yourself a world of hurt later.

"Quality is difficult to define, but the user is the one ultimately deciding whether the data is of quality," notes Barclay.

But what if it's too late? What if you've inherited mounds of data that don't seem to correlate, or, worse, that make no sense? Again, you need a plan. What exactly do you need from the information? Full customer names? Phone numbers? Product preferences? Once you have an action plan, you know where to direct your efforts in order to clean and analyze your data. For example, do you need software that can recognize that customer Bob Schwartz of Newtown, CT is the same as customer Robert Schwartz of Newtown, Conn.? The process can be painstaking: "We found that 25 percent of [our clients'] time is spent clarifying bad data," says Tim Furey, CTO of consulting firm Conversion Services.
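The article doesn't name a particular matching tool, but the heart of that recognition step is record linkage: normalize each record into a canonical form, then score the similarity of candidate pairs. Here is a minimal sketch in Python, using only the standard library's difflib and deliberately tiny nickname and state tables (commercial matching products ship with far larger dictionaries and trained similarity models):

    from difflib import SequenceMatcher

    # Tiny illustrative lookup tables; real matching tools use far
    # larger nickname dictionaries and trained models.
    NICKNAMES = {"bob": "robert", "bill": "william", "liz": "elizabeth"}
    STATES = {"conn": "ct", "connecticut": "ct"}

    def normalize(name, city, state):
        """Reduce a (name, city, state) triple to one canonical string."""
        first, _, last = name.lower().partition(" ")
        first = NICKNAMES.get(first, first)        # Bob -> robert
        state = state.lower().rstrip(".")
        state = STATES.get(state, state)           # Conn. -> ct
        return f"{first} {last} {city.lower()} {state}"

    def same_customer(a, b, threshold=0.85):
        """Treat two records as one customer if their forms are similar enough."""
        return SequenceMatcher(None, normalize(*a), normalize(*b)).ratio() >= threshold

    print(same_customer(("Bob Schwartz", "Newtown", "CT"),
                        ("Robert Schwartz", "Newtown", "Conn.")))   # True

In practice, the similarity threshold -- and which near-misses get routed to a human for review -- are exactly the kinds of decisions the departments have to agree on up front.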

Next comes the integration stage. Here, you -- or a service you've contracted -- will check the information. For example: Are your phone numbers valid? Are they all input in a consistent manner?
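Neither vendor describes an implementation, but a consistency check of this sort can be as simple as stripping every number down to its digits and re-emitting one canonical format, flagging anything that doesn't fit. A sketch, assuming ten-digit North American numbers:

    import re

    def normalize_phone(raw):
        """Return the number in one canonical format, or None if it's invalid."""
        digits = re.sub(r"\D", "", raw)          # keep digits only
        if len(digits) == 11 and digits.startswith("1"):
            digits = digits[1:]                  # drop the country code
        if len(digits) != 10:
            return None                          # flag for manual review
        return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"

    # Four input styles, one output style; the short number is flagged:
    for raw in ["203-555-0147", "(203) 555 0147", "1.203.555.0147", "555-0147"]:
        print(f"{raw:>16} -> {normalize_phone(raw)}")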

"[Studies have found that] five percent of customer master files have data that is wrong," says Furey. With the cost for each file ranging from $100 to $1000 annually, those are expensive mistakes to maintain.

Once the revisions are complete, you are ready for the augmentation process. Here, you add relevant information to the data. For example, if the goal identified in stage one was to complete customer records so that each has a phone number, then that's the data that is searched for and added.
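As an illustration of that augmentation step, consider the sketch below, where the reference source is a hypothetical dictionary standing in for a purchased list, a directory service, or another internal system of record:

    # Hypothetical enrichment source keyed by customer ID.
    REFERENCE_PHONES = {"C-1001": "(203) 555-0147", "C-1002": "(475) 555-0193"}

    customers = [
        {"id": "C-1001", "name": "Robert Schwartz", "phone": None},
        {"id": "C-1002", "name": "Ann Lee",         "phone": "(475) 555-0193"},
        {"id": "C-1003", "name": "Raj Patel",       "phone": None},
    ]

    # Fill only the gaps named in the action plan; never overwrite data
    # that is already present, and leave unmatched records empty.
    for record in customers:
        if not record["phone"]:
            record["phone"] = REFERENCE_PHONES.get(record["id"])

Note that the third record stays empty: when the reference source has no match, the gap is left for a later pass rather than guessed at.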

Finally, once the data is complete, consistently input and accurate, monitor it daily. "Only that will tell you if you are meeting your action plan," says Barclay.
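Barclay doesn't prescribe a mechanism, but in practice daily monitoring usually means recomputing a handful of quality metrics and comparing them against the targets set in the action plan, along these lines:

    # Toy daily snapshot; in practice these rows come from the master file.
    customers = [
        {"id": "C-1001", "phone": "(203) 555-0147"},
        {"id": "C-1002", "phone": "(475) 555-0193"},
        {"id": "C-1003", "phone": None},
    ]

    def quality_report(records, target=0.95):
        """Compare today's phone-number completeness against the plan's target."""
        total = len(records)
        complete = sum(1 for r in records if r.get("phone"))
        rate = complete / total if total else 0.0
        status = "OK" if rate >= target else "ALERT"
        return f"{status}: {complete}/{total} records have a phone number ({rate:.0%})"

    print(quality_report(customers))   # ALERT: 2/3 records have a phone number (67%)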

It is a far easier -- and less expensive -- venture to put processes into place initially to ensure only clean data enters the system. Unfortunately, many companies are forced to implement a "passive" approach (i.e., extracting the data from the system and fixing it afterward) because they've inherited faulty information. "The cost of a passive approach is 200 percent higher than an active one," says Furey. His company, Conversion Services, offers consulting services focused on data warehousing, business intelligence and data management solutions. Because of the tremendous costs involved, "senior management looks at this now as a strategic issue," he adds.

The impact on the bottom line of poor-quality data has shaken companies from top to bottom. "Increasingly," says Garry Moroney, CEO of Similarity Systems, "there are more C-level execs involved, particularly in compliance." Similarity Systems offers two products that help organizations identify and correct data quality problems: Axio for profiling, and Athanor for data quality management. Moroney notes that while the data quality process is typically driven by IT, it's crucial to get support at the top.

"The approach we take allows our users to [improve quality] on a gradual basis: Start with auditing data by the BI guys, and then work through," he says. "Increasingly, we see data quality groups with senior-level management sponsorship."

No matter what platform or type of software you start with, the most important concept to remember is that data quality is a process, not an event. Ensuring good, usable information is work that happens every single day.

"Most important is to make the data trusted," says Furey. "The customer needs to have trust in the data -- or they won't use it."

And what's the good of collecting all that information if you can't use it?

Jennifer Bosavage is a freelance writer based in Huntington, N.Y. You can contact her at [email protected].


About the Author

Jennifer Bosavage

Editor In Chief, Solution Providers for Retail

Writing and editing from the IT metropolis that is Fairfield County, Conn., Jen is Editor In Chief of Solution Providers For Retail. In her role, she oversees all editorial operations of the site, including engaging VARs to share their expertise within the community. She has written for IT professionals for more than 20 years, with expertise in covering issues concerning solution providers, systems integrators, and resellers.

Jen most recently was Senior Editor at CRN. There, she was in charge of the publication's editorial research projects, including: Solution Provider 500, Fast Growth 100, Women of the Channel, and Emerging Vendors, among many others. She launched the online blog, "Channel Voices," and often wrote on career issues facing IT professionals in her blog, "One Year to a Better Career."

Jen began her tech journalism career at Electronic Buyer News, where she covered the purchasing beat. (That was so long ago that blue LEDs were big news.) Starting as copy editor, she worked her way up to Managing Editor before moving to VARBusiness. At VARBusiness, she was Executive Editor, leading a team of writers that won the prestigious Jesse Neal award for editorial excellence.

Jennifer has been married for 22 years and has two wonderful kids (even the teenager). To adults in her hometown, she is best known for her enormous Newfoundland dog; to high schoolers, for her taco nights.

