Accuracy and Precision - InformationWeek


Accuracy and Precision

Analytic accuracy and precision will make or break real-time decision-support systems.

With growing demand for real-time decision support, analytic accuracy and precision are more important than ever. Systems are increasingly embedding analytics and eliminating immediate human oversight from operational processes in the name of speed, efficiency, and economy. They monitor and respond directly to dynamic business conditions, relying on automated measurement, classification, prediction, and execution. Highly automated systems typically do not deal well with uncertainty, so they had better work correctly, accurately and precisely, from measurement to action.

Techniques to improve accuracy and precision are more widely understood than applied, perhaps because they're often proposed out of sensible context, sold as ends in themselves rather than as one means toward a goal that is influenced by many factors. And when they are applied, they seem to be perceived as magic bullets that on their own, in isolation, will target and solve business problems. Meanwhile the race toward profitability through automation and disintermediation continues, which only increases the need.

With the hope of providing useful perspective, I'll devote this column to discussing a number of accuracy and precision techniques. There's no magic bullet, however, because context and actual requirements are key. We'll start with data quality.

Garbage In

You know the old saw, "garbage in, garbage out." The implication is that data quality is an absolute, a "must have," an end in itself without regard for actual needs that can be met by realistic but limited steps.

Take a hypothetical direct-marketing firm where nine addresses out of 100 are undeliverable and three are undetected duplicates due to variant spellings of names or addresses. These errors could be tolerable if the cost of correcting them is greater than the estimated value of the 9 percent missed opportunity plus the cost of delivering to the 3 percent duplicates. The cost of absolute quality may be higher than the return. Spam e-mail is at one extreme of the spectrum: the cost of sending a message is so low that a spammer can send to a dictionary's worth of addresses at a known domain, with absolutely no discretion in targeting, in the hope of a small number of sales.
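The break-even logic above can be sketched directly. In this minimal Python sketch, the function and all dollar figures and rates are invented for illustration: clean the list only when the value recovered exceeds the cost of cleaning.

```python
# Hypothetical cost-benefit check for list cleanup. All figures are
# assumptions for illustration, not real marketing economics.

def cleanup_pays_off(list_size, undeliverable_rate, duplicate_rate,
                     value_per_delivery, cost_per_duplicate, cleanup_cost):
    """Return True if correcting the list recovers more value than it costs."""
    missed_value = list_size * undeliverable_rate * value_per_delivery
    duplicate_cost = list_size * duplicate_rate * cost_per_duplicate
    return cleanup_cost < missed_value + duplicate_cost

# The column's scenario: 9% undeliverable, 3% duplicates, on a
# hypothetical 100,000-name list worth $2.00 per delivery.
print(cleanup_pays_off(100_000, 0.09, 0.03, 2.00, 0.50, 15_000))  # True
print(cleanup_pays_off(100_000, 0.09, 0.03, 2.00, 0.50, 25_000))  # False
```

Under these invented numbers, the recoverable value is $19,500, so a $15,000 cleanup pays off and a $25,000 one does not; the point is the comparison, not any particular threshold.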

The U.S. population census is at the other extreme: The government interprets the Constitutional mandate to perform an enumeration of the population to mean that it must make a significant effort to count every individual without recourse to statistical adjustment for missed or duplicated individuals.

Because the government cannot meet its accuracy goals through statistical techniques, it is making a huge effort to improve its TIGER geographic database and its Master Address File in order to target survey forms more precisely. But responses are not verified and may nonetheless be nonfactual. For instance, I might identify myself as Aleut, and the output tables will dutifully relay my self-reported but incorrect ethnicity. There is no formal data quality problem here, nor is there when someone gives fictitious personal information when signing up to access a Web site. Data quality is important, but only to the extent that efforts don't overshoot the required precision, and it is insufficient on its own to ensure accuracy in the face of data issues unrelated to quality.
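The distinction can be illustrated with a hypothetical Python sketch (the field names and validation rules are invented): a fabricated record passes every formal quality check, because such checks test form and completeness, not truth.

```python
# Sketch: a record can pass every formal data-quality rule and still be
# factually wrong. Fields and rules are hypothetical.
import re

def passes_quality_checks(record):
    """Formal quality checks: completeness and format, never factual truth."""
    return (bool(record.get("name"))                                # non-empty name
            and re.fullmatch(r"\d{5}", record.get("zip", "")) is not None  # 5-digit ZIP
            and record.get("ethnicity") in {"Aleut", "Hispanic", "White", "Other"})

fabricated = {"name": "John Doe", "zip": "90210", "ethnicity": "Aleut"}
print(passes_quality_checks(fabricated))  # True: clean by every rule, yet unverifiable
```

The record is self-reported and may be false, but no quality metric will ever flag it; catching it would require verification against an outside source, which is an accuracy effort, not a quality one.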

Processes and Models

The government conducted an accuracy and coverage evaluation of the census, an independent survey to assess the source and extent of error, but judged that statistically adjusting 2000 census results to improve accuracy would likely introduce errors greater than those the adjustments would eliminate. Fixing results after the fact may not improve their accuracy, because of limitations in the sensitivity of the measurement instrument and of the correction techniques; this reinforces the importance of designing and building accuracy into processing systems. Do a job right, and you won't have to correct the results.

The twin trends of business performance management and business process management are positive steps, given the central role they assign to process manageability and measurability. Both entail modeling organizational dynamics with built-in assessment and decision points, and both accommodate evaluation of alternative scenarios. They differ in that one emphasizes process-quality measurement, monitoring, and optimization, and the other testable, repeatable, intentional (rather than haphazard) processes.

The long-established software-testing concepts of verification and validation (V&V) come into play on this larger, organizational scale. Verification seeks to show that a model or algorithm is implemented correctly, while validation is the determination that you've chosen the right model and algorithms for the problem at hand. For example, in the past, I've picked on analytic software that provides only linear data fitting, which is useless if you're trying to detect and predict periodic effects like seasonality. The programming might be verified as completely correct, but the results will be wildly inaccurate due to the inadequacy of the model applied. Organizational process models need V&V just as much as software programs do, to ensure that models are apt and well coded. Good-quality modeling is important for quality results, with the added dimension that process models should adapt to changing circumstances.
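The linear-fitting critique can be made concrete. In this hypothetical Python sketch (all numbers invented), a correctly implemented least-squares line, one that would pass verification, still fails validation on seasonal data: large residuals remain no matter how correct the code is.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares line y = a + b*x. This implementation is
    verifiably correct; whether a line is the right model is validation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

# Hypothetical monthly demand: a strong 12-month seasonal cycle, no trend.
months = list(range(24))
demand = [100 + 30 * math.sin(2 * math.pi * m / 12) for m in months]

a, b = linear_fit(months, demand)
residual = max(abs((a + b * m) - d) for m, d in zip(months, demand))
# The fitted line is the true least-squares answer (verified), yet the
# worst-case residual stays near the seasonal amplitude: the linear model
# is simply invalid for periodic data.
```

Running this, the worst-case residual is comparable to the 30-unit seasonal swing; no amount of code review fixes a model mismatch, which is exactly why validation is a separate activity from verification.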
