Commentary
4/6/2010
06:13 PM
Curt Monash

The Case for Retaining Everything

I'd like to reemphasize a point I've been making for a while about data retention:

As costs go down, the wisdom of keeping detailed data goes up. I'd go so far as to say that every piece of data generated by a human being should be preserved and kept online, legal and privacy considerations permitting.* Most forms of capital-, labor-, and/or location-based competitive advantage are being commoditized and/or globalized away. But information remains a unique corporate asset. Don't discard it lightly.

*Unless there's an explicit law mandating data destruction, legal considerations should permit retention. The idea "Let's destroy something of irreplaceable value today, against the possibility we might be brought to judgment tomorrow" is both morally and pragmatically weird. Privacy, however, may be a different matter.

That applies to the structured/tabular kinds of data I tend to focus on in this blog. It applies even more to anything document-like (an email, an instant message, whatever) that somebody has taken the trouble to put into words. A top document-oriented archiving analyst (and my good friend), David Ferris, quite agrees. As David puts it:

I think we'll end up archiving everything, except egregious garbage like spam:

  • It's too hard to get users to conform to policy.
  • Automated methods of capturing a human-understandable policy, for example "tax records," are too hard to implement through automatic filters. The filters are too inaccurate.
  • It's impractical to get users to classify everything, and automatic classification is too crude.
  • You never know what you might want later. Stuff you think you won't want now may end up being very useful.
  • The cost of storage is trivial when looked at on a per-user basis.

In particular, I think information destruction is a crude instrument for the protection of privacy, wasteful at best, and likely to be vigorously resisted by governments and large businesses. For example:

  • Businesses are increasingly subject to retention-oriented compliance regulation. Your lawyers may want you to destroy information that could be used to sue you, but governments won't let you.
  • Information about individuals' web surfing is being retained, under law, so that they may be fingered later for pornography consumption or illegal file sharing. I deplore some of the ways web-surfing data can be and is being used, and want laws passed to rein them in. But the retention will happen.
  • Marketers want all that data. Duh.
  • Electronic health records are coming -- slowly, but they'll get here some day.

Besides, archiving technologies are getting ever more cost-effective.
