Data Validation Can't Be Overlooked

Bad data problems can grind business intelligence or data warehouse systems to a halt - or at least they should, because the alternative is inflicting upon users incorrect or inconsistent information... Thus, it is not a bad time for organizations to review their procedures and technology options for data quality, profiling and validation.

David Stodder, Contributor

February 17, 2009

3 Min Read

Bad data problems can grind business intelligence or data warehouse systems to a halt - or at least they should, because the alternative is inflicting incorrect or inconsistent information on users. Adding urgency is today's concern about transparency, which is sure to intensify as government and other regulatory bodies attempt to correct defects in financial systems and processes. Thus, it is not a bad time for organizations to review their procedures and technology options for data quality, profiling and validation.

In recent years, many tools that specialized in these areas have been consolidated into suites and packages. There are definite advantages to tighter integration of tools and user interfaces for data quality, profiling and validation - and even to bringing these closer to data integration processes.

But organizations have to make sure that the suites are giving them the right tools for the job: not profiling or data discovery tools when they need something for data validation, or vice versa. Another concern is that in these tough times, organizations can become reluctant to upgrade their suites because it seems too expensive or time-consuming. Data quality problems then fester rather than get fixed.

Data validation is one of the more unheralded aspects of ensuring that organizations get good data as they integrate flows from multiple sources. But it is essential: Lack of attention to data validation rules and procedures can be the root cause of security problems, unintended data exposures or any number of business mistakes. Of course, many organizations have been writing data validation rules for a long time, often for each of their data sources and often by hand. Today, however, organizations should be looking at how they can use metadata so that they can develop validation rules and apply them centrally to data coming from a variety of sources.
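To make the metadata idea concrete, here is a minimal sketch (in Python, with a hypothetical catalog and rule names of my own invention, not tied to any product) of how validation rules might be defined once per logical data type and applied centrally to records arriving from different sources:

```python
import re

# Hypothetical metadata catalog: each source's fields mapped to a logical type.
CATALOG = {
    "crm_feed":     {"customer_id": "id", "email": "email", "balance": "currency"},
    "billing_feed": {"customer_id": "id", "invoice_total": "currency"},
}

# One central rule per logical type, instead of hand-written checks per source.
RULES = {
    "id":       lambda v: v is not None and str(v).strip() != "",
    "email":    lambda v: v is not None and re.match(r"[^@\s]+@[^@\s]+\.[^@\s]+$", v),
    "currency": lambda v: isinstance(v, (int, float)) and v >= 0,
}

def validate(source, record):
    """Apply the central type rules to a record, driven by the source's metadata."""
    errors = []
    for field, logical_type in CATALOG[source].items():
        if not RULES[logical_type](record.get(field)):
            errors.append(f"{source}.{field} failed '{logical_type}' rule: {record.get(field)!r}")
    return errors

print(validate("crm_feed", {"customer_id": "C-101", "email": "a@b.com", "balance": 42.5}))
print(validate("billing_feed", {"customer_id": "", "invoice_total": -3}))
```

The point of the pattern is that adding a new source means adding a catalog entry, not writing another batch of hand-coded checks.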

For Informatica PowerCenter users, a new tool worth evaluating is DataValidator, from DVO Software. I spoke recently with Val Rayzman, the founder and CEO of the company, who described how the product increases automation for validation and data testing. Organizations can calibrate the granularity of these procedures to their needs. DataValidator applies its rules against metadata generated and managed by PowerCenter, and it is designed to manage rules more effectively so there is less confusion and overlap and greater success in reusing rules over time.
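As a rough illustration of what calibrating granularity can mean in data testing generally (a generic sketch of the source-versus-target pattern, not DataValidator's actual API), a validation job might start with cheap row counts and escalate to aggregate and row-level comparisons only when a discrepancy appears:

```python
# Illustrative data: the target copy has a drifted value for C-103.
source = [("C-101", 42.5), ("C-102", 10.0), ("C-103", 7.25)]
target = [("C-101", 42.5), ("C-102", 10.0), ("C-103", 7.00)]

def row_count_check(src, tgt):
    """Coarsest check: did every row make it across?"""
    return len(src) == len(tgt)

def aggregate_check(src, tgt):
    """Medium check: do column totals agree within a small tolerance?"""
    return abs(sum(r[1] for r in src) - sum(r[1] for r in tgt)) < 1e-9

def row_level_check(src, tgt):
    """Finest (and costliest) check: compare records key by key."""
    return [(s, t) for s, t in zip(sorted(src), sorted(tgt)) if s != t]

print(row_count_check(source, target))  # True  - counts match
print(aggregate_check(source, target))  # False - totals differ
print(row_level_check(source, target))  # [(('C-103', 7.25), ('C-103', 7.0))]
```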

Tools like DataValidator could be a boon for organizations under pressure to improve their management of data integration, validation and quality processes as they scale up to handle more data and more demanding update requirements. I will be keeping my eye out for other tools that offer innovations in these areas.
