Xerox Unveils New Document-Management Technology - InformationWeek

The software can "read" an electronic document, decide how it should be classified, then automatically route it correctly.

Scientists at Xerox's Research Centre Europe in Grenoble, France, said Thursday that they've come up with new classification software clever enough to "read" an electronic document, decide how it should be classified, then automatically route it to the right person's E-mail address or an online document-management system.

The unnamed technology--Xerox refers to it as a categorizing tool--is available now and can be licensed by companies that want to incorporate it into existing document systems, as well as by third-party software vendors in the document-management, customer-relationship management, and information-retrieval markets, Xerox said.

The tool, said Eric Gaussier, a researcher at the Grenoble facility, uses a hierarchical model able to understand the dependency between multiple categories, unlike "flat" search-and-retrieval tools that treat each category separately. Biochemistry and biophysics, for example, are closely related--and are treated as such by Xerox's solution--while flat retrieval systems would consider them separate and thus not cross-link documents in each.
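To make the hierarchical-versus-flat distinction concrete, here is a minimal sketch. It is not Xerox's model, which the article does not describe in detail; the category tree, the `PARENT` table, and the function names are all assumptions made for illustration. It shows only why a tree lets biochemistry and biophysics be treated as related while a flat scheme sees two unrelated labels.

```python
# Hypothetical category tree, child -> parent. The names are illustrative
# only and are not taken from Xerox's software.
PARENT = {
    "biochemistry": "life sciences",
    "biophysics": "life sciences",
    "life sciences": None,
    "contract law": "law",
    "law": None,
}


def ancestors(category):
    """Return the category followed by all of its ancestors, nearest first."""
    chain = []
    while category is not None:
        chain.append(category)
        category = PARENT[category]
    return chain


def related(a, b):
    """Hierarchical view: two categories are related if they share an ancestor."""
    return bool(set(ancestors(a)) & set(ancestors(b)))


def flat_related(a, b):
    """Flat view: categories match only when the labels are identical."""
    return a == b
```

Under these assumptions, `related("biochemistry", "biophysics")` is true because both sit under the same parent, while `flat_related` on the same pair is false, so a flat system would never cross-link documents filed under the two labels.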

The result of this approach, Gaussier said, is faster, better searches and a virtually hands-off approach to digesting and disseminating digital documents throughout an organization.

In the pilot program that Xerox ran with the Swiss Institute of Bioinformatics, an academic nonprofit foundation, "their traditional search engines for medical articles often presented the most pertinent documents at the end of the list," said Gaussier. "Using our software, they were much more successful at finding what they were looking for, and typically had to browse less than half of the list to find the information."

Xerox's new software, written in Java and suitable for deploying on Unix, Linux, and Windows, is the result of four years of steady work in linguistic modeling, semantics, and machine learning, said Gaussier.

It can be used out of the box by adding it to existing document-management applications created by a company, he added. In that approach, "with a set of categories already established, the software takes documents already categorized and, using our models, 'learns' how to automatically classify new documents."
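The idea of learning from documents that are already categorized can be sketched with a standard technique. The example below uses a small naive Bayes word-count classifier, chosen here purely for illustration; the article does not say which models Xerox uses, and the function names and sample documents are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """Count words per category from (text, category) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, category in labeled_docs:
        doc_counts[category] += 1
        word_counts[category].update(text.lower().split())
    return word_counts, doc_counts


def classify(text, model):
    """Return the category with the highest naive Bayes log-score."""
    word_counts, doc_counts = model
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(doc_counts.values())
    best, best_score = None, -math.inf
    for category in doc_counts:
        total_words = sum(word_counts[category].values())
        # Start from the prior: how common this category is in the training set.
        score = math.log(doc_counts[category] / total_docs)
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not zero out a category.
            count = word_counts[category][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best, best_score = category, score
    return best


# Hypothetical, already-categorized documents used as training data.
TRAINING = [
    ("protein enzyme folding", "biochemistry"),
    ("enzyme reaction kinetics", "biochemistry"),
    ("invoice payment overdue", "billing"),
    ("refund payment invoice", "billing"),
]
MODEL = train(TRAINING)
```

With this toy training set, a new document such as "enzyme kinetics study" scores higher for biochemistry than for billing, since two of its three words were seen only in the biochemistry examples.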

In a fresh environment not already equipped with a document-management and routing solution, Xerox's tool walks users through the process of creating categories, then classifies documents as part of one or more of those categories.

In either case, the technology is bright enough to learn new categories on its own as it comes across additional documents. "After a while, if the system doesn't cover all the new topics that have emerged, it will tell you where it's not up to date," and dynamically suggest new categories, Gaussier said.

The software can handle documents written in up to 20 different languages. It also serves as an automatic router, shunting categorized documents to the right person--via E-mail attachments, for instance--based on a pre-set user profile that administrators establish. "This can be used, for example, to route incoming mail to the person responsible for a given topic and eliminate mail in your inbox you aren't interested in," Gaussier said. "Imagine clients' complaints going directly to the person responsible for handling them and your E-mail in-box containing only what you're interested in."
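Profile-based routing of this kind can be pictured as a lookup from a document's categories to the people who have registered an interest in them. The sketch below is an assumption about how such a step might look, not a description of Xerox's router; the addresses, the `PROFILES` table, and the `route` function are hypothetical.

```python
# Hypothetical administrator-defined profiles: each recipient lists the
# categories they are responsible for. Addresses are placeholders.
PROFILES = {
    "alice@example.com": {"complaints", "billing"},
    "bob@example.com": {"biochemistry"},
}


def route(doc_categories, profiles=PROFILES):
    """Return, in sorted order, every recipient whose profile overlaps the
    categories assigned to the document."""
    wanted = set(doc_categories)
    return sorted(addr for addr, topics in profiles.items() if topics & wanted)
```

A document classified under "complaints" would go only to `alice@example.com` here, and a document whose categories match no profile is routed to no one, which mirrors the in-box containing only what the recipient is interested in.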
