Based on what I saw at the Search Summit, there seems to be a new awareness, at ever-higher levels in the corporate responsibility chain, that in a litigious business environment, "enterprise search" is not just a knowledge-management tactic or a productivity aid, but a survival imperative. You will be sued some day. (It's not a matter of "if," but when.) During the discovery phase of the suit, you're going to provide (and also receive from the other side) bewilderingly immense amounts of data. Without good search technology, sifting through the data isn't just tedious but nightmarishly expensive.
I didn't get a chance to attend any e-discovery sessions at the Search Summit, but at lunch, I happened to sit down next to litigation technology consultant (and ESS presenter) Jeff Flax. We had an illuminating chat about search and discovery in the context of records retention.
Flax noted that many companies that have records retention policies aren't following them. He sees a "pack rat" syndrome: a tendency to let expired records remain in the morgue past the "save-till" date. The problem with this is that files that have been declared obsolete or marked for disposition, but have not yet been physically destroyed, are still subject to subpoena. "A good lawyer will ask for expired documents during discovery," Flax notes.
Lawyers are also demanding data in its "native state": not text dumps or PDFs or other derivative forms of the data, but the data as it actually exists. "If I'm a lawyer and I'm requesting someone's e-mails on a certain subject," says Flax, "I don't want the e-mails as text files, I want the original e-mail archive in binary form so I can pick apart the bits and get at all the header and footer and other information in context."
Sometimes physical media must be handed over in discovery so that deleted files can be detected and recovered. "I've seen cases where browser search queries from many years back, supposedly no longer on disk, have been recovered forensically," Flax told me. "And then certain keyword clumps are detected, and those query patterns can become admissible in court."
It turns out that the data in a search index (the index built by a search engine) can often be used to reconstruct a document even after the document itself has been irretrievably lost. Takeaway: A document can't be considered fully destroyed until you've destroyed its search-index data as well. (I wonder how many retention policies take this into account? Doubtless very few.)
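To make that concrete, here is a minimal sketch of how a document's text might be rebuilt from index data alone. It assumes a positional inverted index (one that records where each term occurs, which many engines keep to support phrase queries). The function names `build_index`, `reconstruct`, and `purge`, and the sample record `memo-17`, are hypothetical and not taken from any particular product. Real engines usually lowercase, stem, or drop stop words, so what comes back is typically an approximation of the original rather than a byte-for-byte copy.

```python
from collections import defaultdict


def build_index(docs):
    """Build a positional inverted index: term -> {doc_id: [positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Assumption: naive whitespace tokenizing plus lowercasing.
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def reconstruct(index, doc_id):
    """Recover a document's token sequence using only the index."""
    slots = {}
    for term, postings in index.items():
        for pos in postings.get(doc_id, []):
            slots[pos] = term
    return " ".join(slots[p] for p in sorted(slots))


def purge(index, doc_id):
    """Remove every posting for doc_id, dropping terms left empty."""
    for term in list(index):
        index[term].pop(doc_id, None)
        if not index[term]:
            del index[term]


# Hypothetical record, invented for illustration only.
docs = {"memo-17": "Destroy the Q3 pricing memo before the audit"}
index = build_index(docs)

del docs["memo-17"]  # the "document" is gone, but the index survives
recovered = reconstruct(index, "memo-17")
print(recovered)  # destroy the q3 pricing memo before the audit
```

In this toy model, only a call such as `purge(index, "memo-17")` leaves nothing to recover, which is the point of the takeaway above: disposition has to reach the index as well as the file.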
If you're concerned about e-mail retention (and if you're not, you should be), you might want to look into The E-mail Archiving & Management Report 2008 from CMS Watch. You'll find that the report divides vendors (roughly) along three lines: policy-centric, archive-centric, and SaaS-based. (You can see a free sample here.)
My advice? Never pass up a chance to have lunch with a litigation technology expert. You'll be inundated with food for thought.
Kas Thomas is an Enterprise Architecture analyst at CMS Watch. He previously evaluated J2EE and content-related technologies for Novell. Write him at [email protected].
Lawyers were well represented (you might say) at last week's Enterprise Search Summit in New York. At times, it felt more like an e-discovery conference with analytics and social-computing side-tracks rather than a search conference featuring a few e-discovery sessions...