
In Focus: Evolved Imaging Cuts the Cost of Data Entry

In the seven short years I've reported on information management, I've witnessed a significant evolution in document imaging. Many think of this technology as mature, yet reliability and cost efficiency have improved significantly in recent years, and it remains one of the most widely deployed content management technologies.

InformationWeek Staff, Contributor

February 8, 2005


Today's faster, better and cheaper document scanners are turning out higher quality images, and storage is hardly the cost barrier it once was. As a result, document imaging broke out of the big-bank, big-insurance, big-government mold years ago, and it is affordable to small and midsize retail chains, manufacturers, healthcare organizations, utilities and financial services companies — even accounting and HR departments dealing with just a few hundred documents per day.

The front end of imaging (the on-ramp for paper, as some call it) is the capture software, which is used to index images and/or extract the data needed for the insurance claim, the invoice, the account application or other process. Here, too, the technology is faster, cheaper and far more versatile and powerful than it was seven years ago, automating data-entry tasks with less customization and greater speed and accuracy.

Not so long ago, document indexing and forms processing were handled separately. But today, vendors including AnyDoc, Captiva, Datacap, Doculex, Kofax, ReadSoft, Top Image Systems and Verity have combined the recognition technologies and index export tools required to serve just about any need.

Let's say you're processing mortgage loan applications. In the old days, you might have relied on an all-purpose capture module — probably the one bundled with your document management system — to scan and index the incoming applications and supporting documentation. Manual key entry was required to apply index fields for searching, and data entry clerks also had to key in all the detailed information from the application forms to get the loan approval process rolling.

The sophisticates of seven years ago added forms processing systems to cut data-entry costs. But these efforts focused strictly on the loan application forms, applying barcode recognition, OCR (machine print recognition), mark sense recognition (fill-in boxes/bubbles) or even ICR (machine and hand print recognition) to automate the data collection. Entry clerks still had to check over the recognition results, but database lookups, validations and rules helped minimize the editing task. If you were really sophisticated back then, you captured everything in a single workflow without manual presorting of documents, and form-ID technology automatically recognized and processed the application form images within each batch.

Today, you can accomplish all these steps on a single platform, and indexing is often as automated as forms processing. Many capture systems can even classify, index and extract data from document types that vary widely from sample to sample.

Xerox Global Services in France uses so-called "unstructured document" processing technology from Top Image Systems (TIS) to capture some 1,500 letters a day for Club Dial, a direct-marketing music club that does business online at www.clubdial.com (for those of you who speak French).

Club Dial has a classic indexing problem, the kind that used to require manual data entry from each document, because no two letters are alike. You just can't predict where the data will appear on the page, and, in fact, the letters come in all different sizes, colors and paper weights, from Post-it notes to letter stock to onion skin to postcards.

Xerox scans all the letters with Kodak scanners that capture color and black-and-white images simultaneously. The TIS eFlow capture software recognizes all text on the bitonal images using ICR technology and then searches for customer numbers, customer names, city names and postal codes. Formatting rules and database lookups aid in the data extraction, and once the indexing is done, Xerox uploads the color images and index files to a Web-based management system so Club Dial employees can resolve customer requests and complaints.

"The capture software automatically spots the client number at least 85 percent of the time, and it finds the name, city or postal code at least 40 percent of the time," says Bernard Beck, a project manager at Xerox. If you can find one or two of these values, he says, you can almost always associate the image with the right customer. "We typically have no more than 10 letters out of 1,500 a day that we can't index automatically."
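For readers who want to see the idea in miniature, here is a rough Python sketch of how formatting rules and a database lookup can tie free-form letter text to a customer record. It is not the TIS eFlow implementation: the customer table, the 8-digit customer-number format, the function name index_letter and the "two matching fields" fallback rule are all assumptions made for illustration. The 5-digit postal code reflects the French format.

```python
import re
from typing import Optional

# Hypothetical customer table; a real system would query a database.
CUSTOMERS = {
    "10482277": {"name": "MARTIN", "city": "LYON", "postal_code": "69003"},
    "10519034": {"name": "DUBOIS", "city": "NANTES", "postal_code": "44000"},
}

# Assumed formatting rules: 8-digit customer numbers, 5-digit postal codes.
CUSTOMER_NO = re.compile(r"\b\d{8}\b")
POSTAL_CODE = re.compile(r"\b\d{5}\b")


def index_letter(text: str) -> Optional[str]:
    """Return the matching customer number, or None to route to manual indexing."""
    # Rule 1: accept a number in the expected format only if the lookup confirms it.
    for candidate in CUSTOMER_NO.findall(text):
        if candidate in CUSTOMERS:
            return candidate

    # Rule 2 (assumed): fall back to name, city and postal code, and accept
    # only when exactly one customer matches at least two of the three fields.
    words = set(re.findall(r"[A-Z]+", text.upper()))
    codes = set(POSTAL_CODE.findall(text))
    matches = []
    for number, record in CUSTOMERS.items():
        hits = (
            (record["name"] in words)
            + (record["city"] in words)
            + (record["postal_code"] in codes)
        )
        if hits >= 2:
            matches.append(number)
    return matches[0] if len(matches) == 1 else None
```

A letter reading "Mme Dubois, 12 rue des Lilas, 44000 Nantes" would resolve to the second record through the fallback rule, while a letter with only a surname would be left for a clerk. The lookup is what makes the loose recognition usable: a misread digit string is discarded because it matches no customer.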

A small percentage of companies have managed to supplant paper-based processes entirely, through online interactions, straight-through transactions, and electronic and Web-based forms of one type or another. But they are far outnumbered by those that remain in the dark ages of data entry and paper filing.

Invoices, explanation-of-benefit forms, proof-of-insurance forms, titles, deeds, customer letters and resumes are all examples of documents that are beyond any one company's control, and automated processing applications for these types of documents are now cropping up all over the globe. What once was sophisticated is fast becoming commonplace, and what once demanded huge volumes to justify an investment can now pay off with only hundreds or a few thousand documents per day.

— Doug Henschen, Editor, Managing Content
