While classifying information is a challenge, finding it often proves an even higher hurdle.
Major data stores, such as network-attached storage filers or e-mail archives, are the low-hanging fruit. Storage administrators generally know where they are. But other data stores are trickier. SharePoint servers, for example, are relatively easy to deploy, which means line-of-business managers can set up one or two on their own, without IT's permission or knowledge. After a recent audit, one of HP's bank customers found more than 5,000 SharePoint implementations it wasn't aware of, says Jonathan Martin, chief marketing officer for HP's information management software group. Those servers likely hold information that falls under a retention and disposition policy.
Online collaboration tools--such as Socialtext, PBwiki, and Google Docs--are another area of concern. Users can upload business content to these sites in seconds with IT none the wiser, and the data moves beyond the reach of classification and disposition systems. Proactive IT organizations will provide sanctioned collaboration tools that blend administrative controls, such as provisioning, deprovisioning, and authorization, with the ease of use of Web 2.0 apps. This way, you can ensure that content created in these collaborative environments can be discovered--and destroyed--in accordance with policy.
Just as significant are user desktops and laptops. User hard drives are chock-full of corporate data, as are portable flash drives and other removable storage media.
So what's to be done? For user devices, agents are a good answer. EMC, for one, pitches its RSA Data Loss Prevention agents for information management. Deployed on endpoints, these agents can find and identify content. They focus mainly on enforcing use policies, such as preventing certain kinds of information from being attached to an e-mail or saved to a removable drive. But the classification capability could be repurposed to ensure that information on user endpoints also meets retention policies. Backup agents could play a similar role. They already copy data from local machines to backup servers, so they're naturals for legal discovery and for retention and disposition.
No vendor has yet made product or road map announcements to this effect, but as HP's Martin says, "It's a natural evolution that organizations want to leverage the investment they've made in backup for more than just simple recovery."
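To make the endpoint-agent idea concrete, here is a minimal sketch of the kind of check such an agent might run: walk a user's directory and flag files that have outlived their retention period. It doesn't represent any vendor's product. The `RETENTION_DAYS` table, the `find_expired` function, and the shortcut of classifying files by extension are all assumptions made for illustration; a real agent would classify by content and metadata.

```python
import os
import time
from pathlib import Path

# Hypothetical retention periods in days, keyed by file extension.
# A real policy would come from a records-management system.
RETENTION_DAYS = {".log": 90, ".docx": 365 * 3, ".pst": 365 * 7}


def find_expired(root, retention_days=RETENTION_DAYS, now=None):
    """Return files under root whose last-modified age exceeds retention."""
    now = time.time() if now is None else now
    expired = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            days = retention_days.get(path.suffix.lower())
            if days is None:
                continue  # unclassified file: leave it for manual review
            age_days = (now - path.stat().st_mtime) / 86400
            if age_days > days:
                expired.append(path)
    return sorted(expired)
```

A sketch like this only reports candidates. Whether to delete, archive, or place a legal hold on them is a policy decision that belongs elsewhere.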
Data disposition has clear benefits for IT and for the business. A sound disposition policy will help enterprises reduce storage costs and reclaim disk space. The tools needed to find and classify data can be leveraged as part of an information management strategy. Regular purging also will reduce discovery costs in the event of litigation. It's shredding time.
Illustration by Sek Leung
Records Retention: Practice What You Preach