Commentary

Text Technologies in the Legal World

Discovery is a legal process whereby parties to a lawsuit request and provide documents and information that may be pertinent in litigation. "Discovery" also describes an analytics goal that has nothing to do with the court system: extraction of useful information — data, facts, and rules, which together constitute knowledge — from databases and textual sources. Can legal mandates be turned to business advantage?

I had expected the December 2006 federal rules amendments on discovery of electronically stored information ("discovery" here in the legal sense) to open new vistas for application of knowledge-discovery technologies: data mining, machine learning, visualization, and the like. The reasoning is simple. Corporations must now retain vast volumes of electronic records, including e-mail and information from enterprise operational systems. To comply with e-discovery mandates, they must be able to "produce" records in response to discovery processes, and that means metadata-management, classification, search, and similar systems. Organizations incur huge expense to store and support retrieval of these records. Why not take the next step and mine them, both for litigation support (literally making the case) and for business value?

The message I took away from last week's LegalTech conference was Not So Fast. We're up against centuries of established, formalized practices. Words written in 1996 to describe knowledge discovery in databases, while no longer true for KDD*, still apply in the legal domain: "The traditional method of turning data into knowledge relies on manual analysis and interpretation." It appears that most organizations are still coming to grips with basic automation to ensure legal compliance and are not yet ready to take that next step.

I believe that the legal-sector opportunity for application of text data mining technologies is huge. Conceptual/semantic search, information extraction, clustering for term reduction (would that be like having a sentence commuted?), document processing, and link and association analysis that goes beyond tracing e-mail threads: these possibilities, once proven to the satisfaction of the courts, could offer immense benefit to litigators, just as they now do to investigators.

The presence and prominence at LegalTech of forward-looking vendors (I talked to representatives of Autonomy, EED, Ernst & Young, MetaLINCS, Recommind, Stratify, ZyLAB, and others) suggest that the mainstreaming of legal-domain knowledge extraction can't be all that far off.

The opportunity to derive business benefit from retained records is similarly great. Legal mandates can be turned to business advantage. We IT and business folks may yet thank the lawyers.

*KDD now stands for Knowledge Discovery and Data Mining.


Seth Grimes is an analytics strategist with Washington, DC-based Alta Plana Corporation. He consults on data management and analysis systems.
