An information life-cycle management strategy is key to addressing expanding data volumes and regulations for data archiving and retrieval.
Information life-cycle management holds great promise for helping enterprises manage data storage from the time information is created until it's destroyed. Unfortunately, ILM isn't a product you can buy. Rather, it's a constellation of policies, processes, services, and products designed to meet business goals for the data.
Those goals are influenced by a variety of factors, including the value of the data, whether there are regulatory imperatives attached to it, and whether it will need to be retrieved for legal discovery or some other reason.
This means data items--such as database records, e-mails, and Word documents--should be stored on the most appropriate medium, depending on the data's value. Ideally, the ILM system will recognize the data's business value and direct it to the appropriate storage medium. The medium will change as the data's value changes over time. For instance, last quarter's financial statements will likely be easily accessible on a disk array. Financial statements from 15 years ago will likely be on tape at an off-site storage facility.
Because an ILM system aims to move data to the most appropriate medium based on the data's value, this implies a tiered storage environment with different costs, security levels, and redundancy requirements. The tiered architecture then requires a migration engine to move data among tiers. And the third element, the classification engine, is the linchpin of the system--and the most challenging.
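To make the relationship among the three elements concrete, here is a minimal sketch in Python. It assumes the classification engine has already reduced each item's business value to a score between 0 and 1; the tier names, thresholds, and the `DataItem`, `target_tier`, and `plan_migrations` names are hypothetical and do not come from any product.

```python
from dataclasses import dataclass

# Hypothetical tiers, ordered from most to least expensive.
# Each entry is (tier name, minimum value score that justifies the tier).
TIERS = [
    ("disk_array", 0.7),
    ("nearline_disk", 0.3),
    ("offsite_tape", 0.0),
]


@dataclass
class DataItem:
    name: str
    value: float        # assumed 0.0-1.0 score produced by a classification engine
    current_tier: str


def target_tier(value: float) -> str:
    """Return the most expensive tier whose threshold the value score meets."""
    for tier, min_value in TIERS:
        if value >= min_value:
            return tier
    return TIERS[-1][0]  # anything below every threshold goes to the cheapest tier


def plan_migrations(items):
    """Planning step of a migration engine: list (name, from_tier, to_tier) moves."""
    moves = []
    for item in items:
        dest = target_tier(item.value)
        if dest != item.current_tier:
            moves.append((item.name, item.current_tier, dest))
    return moves


items = [
    DataItem("last_quarter_financials", 0.9, "disk_array"),
    DataItem("financials_15_years_old", 0.1, "disk_array"),
]
moves = plan_migrations(items)
print(moves)
```

The sketch only plans moves; it leaves out the hard part, which is producing the value score in the first place.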
The job of the classification engine is to determine the business value of a data set at a given time. But how do you measure value? One metric is the potential cost incurred if the data is lost or not available. This metric has a time element associated with it. For instance, if CAD files for a new project are unavailable for a few hours, the cost will be less severe than if recovery of those files takes a day or a week.
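One rough way to express that time element is a cost function of the outage duration. The sketch below is illustrative only: the hourly cost figure and the `escalation` exponent are assumptions chosen for the example, not measured values.

```python
def downtime_cost(hours_unavailable: float,
                  cost_per_hour: float,
                  escalation: float = 1.0) -> float:
    """Estimate the cost of data being unavailable for a given time.

    escalation = 1.0 gives a cost that grows linearly with the outage;
    a value above 1.0 models a cost that grows faster as the outage drags on.
    Both parameters are assumptions the business would have to supply.
    """
    if hours_unavailable < 0:
        raise ValueError("hours_unavailable must be non-negative")
    return cost_per_hour * hours_unavailable ** escalation


# Hypothetical figure: a stalled CAD project costs 500 per hour.
for label, hours in [("a few hours", 4), ("a day", 24), ("a week", 168)]:
    print(label, downtime_cost(hours, cost_per_hour=500))
```

Comparing such estimates across data sets gives the classification engine one input for ranking them, alongside the other measures discussed next.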
Other measures include the sensitivity of data. For instance, companies may face fines, lawsuits, and the loss of customer trust if they lose or expose credit card numbers. Another important measure is whether data may have to be produced in response to an e-discovery request or an audit. For instance, the Federal Rules of Civil Procedure require that litigants produce e-mail and other electronic files related to a lawsuit within 90 days. Companies without a clear strategy will find it difficult to meet such requirements.
ILM is more a management strategy than a technology, so specific tools are required to implement data classification and migration. No one tool provides those elements across unstructured, structured, and semistructured data.
Unstructured file data, from Microsoft Office documents to medical images, is being addressed by emerging ICM (information classification and management) tools that can estimate a file's value from existing metadata and content, then migrate files to new locations based on that value. Data in relational databases can be managed by database archiving tools, and e-mail archiving tools can help manage the reams of content generated by users' in-boxes.
There are four main drivers for an ILM strategy: regulatory compliance, e-discovery assistance, operational performance improvements, and productivity increases.
Enterprises face a variety of regulations and compliance mandates, such as Sarbanes-Oxley and the Health Insurance Portability and Accountability Act, that require data to be stored for specific lengths of time. Other requirements include SEC 17a-4, which requires that securities broker/dealers record customer communications and store them in a nondeleteable, nonmodifiable form. An ILM strategy can help manage these requirements and ensure that data is stored as long as mandated.
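A retention rule of this kind can be reduced to a simple date check. The sketch below assumes a lookup table of retention periods; the record types and year counts are placeholders for illustration only, and actual periods must come from the regulations themselves and from legal counsel.

```python
from datetime import date

# Placeholder retention periods in years. These numbers are NOT taken from
# any regulation; they exist only to make the example runnable.
RETENTION_YEARS = {
    "financial_record": 7,
    "customer_communication": 6,
}


def retention_end(created: date, record_type: str) -> date:
    """Return the first date on which the record is no longer under retention."""
    years = RETENTION_YEARS[record_type]
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # Created on Feb. 29 and the target year is not a leap year.
        return created.replace(year=created.year + years, day=28)


def may_destroy(created: date, record_type: str, today: date) -> bool:
    """True only once the mandated retention period has fully elapsed."""
    return today >= retention_end(created, record_type)


print(may_destroy(date(2010, 3, 1), "financial_record", date(2016, 12, 31)))
print(may_destroy(date(2010, 3, 1), "financial_record", date(2017, 3, 1)))
```

A check like this covers only the time element; requirements such as nondeleteable, nonmodifiable storage have to be enforced by the storage system itself.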
Hot on the heels of regulatory compliance is e-discovery. E-mail records and other electronic files are routinely sought in legal discovery processes. ILM-related products such as e-mail archiving tools can make the discovery process manageable and help deliver the right information within the specified time frame.
As storage volumes grow, enterprises need a way to improve operational efficiency and manage the costs associated with that growth. An ILM strategy can help by identifying when data can be moved to lower-cost storage media or even destroyed. ILM also can play a role in business continuity plans. For instance, reducing the size of a primary data store lets applications run faster and makes snapshots and backups smaller.
Last but certainly not least, an ILM strategy can increase productivity by giving users enhanced visibility of their own data. In our survey of 291 readers, records management and file archiving were the most common reasons for undertaking an ILM initiative (see Figure 1, below).