The InformationWeek -- Blogs
Information Management Blog

Cutting The Digital Fat


Posted by Andrew Conry-Murray, Nov 10, 2008 02:03 PM

Unnecessary copies and older versions of documents clog up employee hard drives and make discovery exercises longer and more expensive. One vendor's software aims to help companies get smarter -- and more aggressive -- about deleting digital fat.


E-discovery forces companies to re-examine their retention and disposition practices. Companies that don't get rid of content as soon as possible will spend more time and money sorting through piles of information, much of it irrelevant, than companies with vigorous disposition policies and processes.

Of course, implementing a vigorous disposition regime is a challenge. A company called NextPage has an interesting approach to that challenge.

NextPage tackles employee hard drives, shared drives, and SharePoint. These areas are often filled with unnecessary copies of existing files, including older versions of finished documents. All these copies and versions add to the pile of information that has to be searched, both by software tools and investigators, in a discovery exercise.

The company's software makes it easier to identify the final versions that need to be preserved while eliminating older versions and duplicates. The software consists of agents that reside on employees' machines, as well as server software that lets administrators set policies, monitor documents, and take actions, such as deleting files or saving the most recent version to a different repository, such as a content management system.

Here's how it works. The NextPage agent tags new Office documents (Word, Excel, and PowerPoint) as they are created by employees. NextPage calls this tag a "digital thread." It's a metadata stamp that includes a unique identifier, and it remains with the document from creation to deletion. If the agent can't tag a document, it creates a unique hash value for the document instead.
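To make the tag-or-hash idea concrete, here is a minimal sketch in Python. It is not NextPage's implementation, which the vendor hasn't detailed here. The function name, the `thread_id` metadata key, the extension list, and the choice of UUIDs and SHA-256 are all assumptions made for illustration.

```python
import hashlib
import uuid
from pathlib import PurePath

# Assumption: only these Office-style extensions can carry an embedded tag.
TAGGABLE_EXTENSIONS = {".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx"}


def identify_document(filename: str, content: bytes, metadata: dict) -> str:
    """Return the identifier used to track a document (hypothetical helper).

    Reuses an existing tag, stamps a new one when the format allows it,
    and otherwise falls back to a hash of the file content.
    """
    # A document that was already stamped keeps its identifier.
    if "thread_id" in metadata:
        return metadata["thread_id"]

    # Taggable formats get a fresh unique identifier stored in metadata.
    if PurePath(filename).suffix.lower() in TAGGABLE_EXTENSIONS:
        metadata["thread_id"] = "tag:" + uuid.uuid4().hex
        return metadata["thread_id"]

    # Fallback: derive an identifier from the bytes of the file itself.
    return "hash:" + hashlib.sha256(content).hexdigest()
```

Note the trade-off the sketch exposes: an identifier stored in metadata survives edits and renames, while a content hash changes as soon as a single byte of the file changes.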

Once a document is tagged, NextPage can follow the document through its life cycle as it is edited, shared, renamed, and so on. It also can see when a file is attached to an e-mail.

If Employee A e-mails a tagged document to Employee B, and they both have the NextPage agent, the system knows a copy resides on Employee B's hard drive, and will subsequently track any changes that Employee B makes to that file while also associating it with the original file from Employee A.

This is the "thread" that follows the document throughout the enterprise. If a tagged file is sent to a user without the agent, the software notes that it was sent, but won't be able to follow additional changes to the document. However, if that document then comes back to the original sender, the system will pick up where it left off.
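The tracking behavior described above can be modeled in a few lines. This is a toy sketch under stated assumptions, not the product's data model: `DocumentThread`, `ThreadTracker`, and their methods are hypothetical names, and edits are reduced to a simple per-user counter.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentThread:
    thread_id: str
    # user -> number of edits observed on that user's copy
    copies: dict = field(default_factory=dict)
    # recipients without an agent; their changes cannot be observed
    untracked_recipients: list = field(default_factory=list)


class ThreadTracker:
    def __init__(self, users_with_agent):
        self.users_with_agent = set(users_with_agent)
        self.threads = {}

    def create(self, thread_id, owner):
        """Start a thread when a tagged document is created."""
        self.threads[thread_id] = DocumentThread(thread_id, {owner: 0})

    def email(self, thread_id, sender, recipient):
        """Record that a tagged document was sent as an attachment."""
        thread = self.threads[thread_id]
        if recipient in self.users_with_agent:
            # A tracked copy now lives on the recipient's machine.
            thread.copies.setdefault(recipient, 0)
        else:
            # The send is noted, but later changes are invisible.
            thread.untracked_recipients.append(recipient)

    def edit(self, thread_id, user):
        """Count an edit, but only for copies the agents can see."""
        thread = self.threads[thread_id]
        if user in thread.copies:
            thread.copies[user] += 1
```

In this model, a document that returns from an untracked recipient to the original sender simply rejoins the existing thread, because the sender's copy was never dropped from it.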

People Problems

The technology seems fairly straightforward. More problematic is how the system would actually be used in an enterprise.

NextPage says its customers tend to have humans make decisions about the final disposition of content. That's not a surprise -- lots of companies aren't comfortable with automated deletion of content. Human involvement may take the form of a list of older files that gets presented to a user with instructions for getting rid of those files.
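A review list of that kind could be built roughly as follows. This is a sketch under assumptions, not a description of the product: `TrackedFile` and `disposition_candidates` are hypothetical names, and "older" is taken to mean any copy in a thread other than the most recently modified one.

```python
from collections import defaultdict
from typing import NamedTuple


class TrackedFile(NamedTuple):
    thread_id: str   # identifier shared by all copies of one document
    path: str
    modified: float  # last-modified time, seconds since the epoch


def disposition_candidates(files):
    """Group files by thread and return every copy except the newest.

    Nothing is deleted here; the result is only a list for a person
    to review.
    """
    by_thread = defaultdict(list)
    for f in files:
        by_thread[f.thread_id].append(f)

    candidates = []
    for group in by_thread.values():
        newest = max(group, key=lambda f: f.modified)
        candidates.extend(f for f in group if f is not newest)

    # Stable, readable ordering: by thread, oldest copy first.
    return sorted(candidates, key=lambda f: (f.thread_id, f.modified))
```

Keeping the function read-only reflects the point above: the software proposes, and a human decides what actually gets deleted.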

But human involvement comes at a price. Employees are reluctant to get rid of older data, regardless of how infrequently they may access it once a project is completed. This means project managers and/or records managers will have to invest time and effort getting users to actually pull the trigger, and following up to make sure users haven't ignored or attempted to subvert disposition requests. The NextPage software can tell if users have saved copies of tagged files to removable media, including disks and thumb drives.

As for implementation, NextPage says its customers typically come from the CIO, CSO, or general counsel's office. These offices generally have the clout to drive a new policy in the organization, but as mentioned, enterprises shouldn't be surprised to find it takes some effort to get users accustomed to getting rid of their files.

Another potential issue is the sheer complexity involved in document tracking itself. As documents get passed around among collaborators and iterations pile up, things will get ugly fast, and the digital threads risk becoming a tangled ball of string. Potential customers should be sure they are comfortable with the management interface for tracking documents.

However, for companies that face multiple lawsuits every year, there's real value in reducing the content haystacks that must be searched during discovery. And while retention and disposition are more easily managed in free-standing content repositories such as e-mail archives and content management systems, there aren't a lot of good options in user land. NextPage is worth a look.
