How to Safely Delete Obsolete Data

Obsolete data overloads storage resources and is a potential security risk. Learn how you can get rid of useless data safely and securely.

John Edwards, Technology Journalist & Author

November 13, 2023

At a Glance

  • The best way to handle obsolete data is to minimize the amount of data collected.
  • A reliable way to identify obsolete data is to periodically review all collected data and then draft a retention policy.
  • Data auditing tools, metadata analysis, and usage analytics can all help organizations sift through large repositories.

More data equals more risk, observes Elizabeth Nammour, co-founder and CEO of cybersecurity company Teleskope. “With privacy regulations, such as GDPR and CCPA, organizations need to be vigilant to protect all sensitive data or risk hefty fines for non-compliance,” she explains in an email interview. “There really is no choice -- you don’t store what you don’t need.”

When considering a data strategy, enterprise leaders should always prioritize quality over quantity, says Julie Mungai, a security and compliance advisory manager with consulting firm BARR Advisory via email. “This means only continuing to store data that's accurate and compliant,” she notes. “Anything else is obsolete, and the cost of retaining obsolete data far outweighs the benefits.”

Removing obsolete data helps organizations minimize both legal and compliance risks. Keeping dead data is risky business, Mungai warns. “Regularly removing obsolete data also helps businesses reduce costs, use their resources more efficiently, and, perhaps most importantly, build customer trust.”

Stockpiling large amounts of obsolete data can also lead to high storage costs, data management challenges, and an increased security risk. Additionally, retaining large amounts of obsolete data can place a burden on end users. “It’s more difficult to find relevant information if you need to sift through redundant or obsolete data,” says Laura Kurup, Accenture Federal Services' principal director, data science innovation, in an email interview.

Reliable Identification

A reliable way to identify obsolete data is to periodically review all collected data and then draft a retention policy based on sensitivity and need. Any data that doesn't need to be stored should be deleted promptly, Mungai states. “For instance, a company might set up an identity verification system to delete data immediately after verification,” she notes. Another example could be deleting users’ precise location data after 30 days if that data has served its purpose. “Automating these processes can help businesses avoid collecting large amounts of obsolete data.”
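As a rough illustration of the two examples Mungai gives, the sketch below applies a small retention table to a list of records. The category names, the `Record` fields, and the `purge` helper are all hypothetical, and the periods are placeholders; a real policy would come from the organization's own review.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical retention periods per data category (illustrative values only).
RETENTION = {
    "identity_verification": timedelta(0),    # delete as soon as verification is done
    "precise_location": timedelta(days=30),   # delete 30 days after collection
}


@dataclass
class Record:
    record_id: str
    category: str
    collected_at: datetime
    purpose_served: bool  # True once the data has done its job


def is_expired(record: Record, now: datetime) -> bool:
    """True when the record has served its purpose and outlived its retention period."""
    period = RETENTION.get(record.category)
    if period is None:
        return False  # category not covered by the policy: leave for manual review
    return record.purpose_served and now - record.collected_at >= period


def purge(records: list[Record], now: datetime) -> tuple[list[Record], list[Record]]:
    """Split records into (kept, deleted) according to the retention table."""
    kept, deleted = [], []
    for record in records:
        (deleted if is_expired(record, now) else kept).append(record)
    return kept, deleted
```

Run on a schedule, a job like this returns the records to delete and leaves anything in an unlisted category untouched, so unknown data is flagged for review and never silently removed.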

Many organizations store vast amounts of unstructured data, including email archives and file repositories that may contain a significant amount of obsolete data. Data auditing tools, metadata analysis, and usage analytics can all help organizations sift through large repositories. “If data is old and has not been accessed in a significant amount of time, it’s likely to be obsolete,” Kurup says. To establish a comprehensive data retention and deletion policy, she advises IT and business teams to work collaboratively to agree on a definition of “obsolete” for various types of data.
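A minimal sketch of the metadata approach Kurup describes might walk a file repository and flag anything that has been neither read nor modified for a long time. The function name `find_stale_files` and the one-year default are assumptions made for illustration. Note that many filesystems update access times lazily or not at all, so the result is a list of candidates for review, not a deletion list.

```python
import os
import time
from pathlib import Path


def find_stale_files(root, max_idle_days=365, now=None):
    """Yield (path, idle_days) for files not accessed or modified within max_idle_days."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # 86400 seconds per day
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # Use the more recent of last access and last modification.
            last_used = max(info.st_atime, info.st_mtime)
            if last_used < cutoff:
                yield path, int((now - last_used) // 86400)
```

The output is best treated as input to the business conversation about what counts as "obsolete" for each type of data.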

To achieve comprehensive privacy compliance, organizations should create a data map listing all of the types of personally identifiable information (PII) collected, processed, stored, or shared, says Beth Fulkerson, a privacy and data security partner at law firm Culhane Meadows via email. “A data management policy can be created in a similar manner, involving different stakeholders for different types of data and taking into consideration laws that require deletion rather than retention.”
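One possible shape for such a data map is sketched below. The field names, the sample entries, and the retention values are invented for illustration and are not drawn from any regulation or from Fulkerson's practice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataMapEntry:
    pii_type: str
    systems: list[str]                 # where the data lives
    activities: set[str]               # subset of {"collected", "processed", "stored", "shared"}
    owner: str                         # stakeholder responsible for this data type
    retention_days: Optional[int] = None    # None means no retention period agreed yet
    deletion_required_by_law: bool = False  # flag for laws that mandate deletion


# Hypothetical entries; a real map would be built with each business stakeholder.
DATA_MAP = [
    DataMapEntry("email address", ["crm"], {"collected", "stored", "shared"},
                 owner="marketing", retention_days=730),
    DataMapEntry("precise location", ["mobile-backend"], {"collected", "processed"},
                 owner="product"),
    DataMapEntry("government ID scan", ["kyc-service"], {"collected", "processed"},
                 owner="compliance", retention_days=0, deletion_required_by_law=True),
]


def entries_missing_retention(data_map):
    """Return the PII types that still have no agreed retention period."""
    return [entry.pii_type for entry in data_map if entry.retention_days is None]
```

A check such as `entries_missing_retention(DATA_MAP)` gives teams a simple list of the data types that still need a retention decision.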

Safer Handling

The best way to handle obsolete data is to minimize the amount of data collected. “Set retention periods upfront, and automate data disposal,” Mungai suggests. “By reducing the data that's collected to only the bare minimum, organizations significantly reduce the risk of inadvertently storing unnecessary data.” She also advises using automated data retention and disposal techniques, such as configuring retention schedules to automatically dispose of data when certain criteria are met. “This approach is effective because it's proactive instead of reactive,” Mungai notes.

Kurup says organizations should plan for archiving or deleting data long before it becomes obsolete. “Establishing a strong data governance program allows organizations to manage the data lifecycle from the moment it's ingested or created,” she explains. “If data is effectively managed and tagged, and a data retention and deletion policy is in place, data archiving and deletion can happen regularly and even be automated.”
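To make the idea of tagging at ingestion concrete, here is a minimal sketch in which every item is stamped with a retention class and a deletion date the moment it enters the system. The class names, day counts, and helper functions are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical retention classes: days to keep, or None for "keep until released".
RETENTION_CLASSES = {"transient": 30, "operational": 365, "legal_hold": None}


@dataclass(frozen=True)
class TaggedItem:
    name: str
    retention_class: str
    ingested_on: date
    delete_after: Optional[date]  # None means no automatic deletion


def tag_on_ingest(name: str, retention_class: str, ingested_on: date) -> TaggedItem:
    """Attach a retention class and computed deletion date when data is ingested."""
    if retention_class not in RETENTION_CLASSES:
        raise ValueError(f"unknown retention class: {retention_class}")
    days = RETENTION_CLASSES[retention_class]
    delete_after = None if days is None else ingested_on + timedelta(days=days)
    return TaggedItem(name, retention_class, ingested_on, delete_after)


def due_for_deletion(items, today: date):
    """Return the items whose deletion date has arrived."""
    return [i for i in items if i.delete_after is not None and i.delete_after <= today]
```

Because the deletion date is fixed at ingestion, a scheduled job only needs to call `due_for_deletion` with the current date; nobody has to decide later whether an old item is still needed.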

Many organizations make the mistake of treating data disposal as purely a technology problem. Storage costs also affect the IT budget, which may provide the impetus for tackling obsolete data, Kurup says. Still, any effective solution will require close collaboration with business leaders. The fact that data is old or seldom used doesn't by itself mean it's obsolete, she cautions. “It can be difficult to get business consensus on deletion without close collaboration on the approach.”

Key Components

Training and change management are key components of a successful data management program. “Adjusting user behavior related to email, file storage, tagging, and sharing can streamline data management and significantly reduce the amount of obsolete data created,” Kurup says. “To be successful, organizations should work to improve the user experience of these daily tasks, rather than simply establishing new policies and procedures.”

About the Author(s)

John Edwards is a veteran business technology journalist. His work has appeared in The New York Times, The Washington Post, and numerous business and technology publications, including Computerworld, CFO Magazine, IBM Data Management Magazine, RFID Journal, and Electronic Design. He has also written columns for The Economist's Business Intelligence Unit and PricewaterhouseCoopers' Communications Direct. John has authored several books on business technology topics. His work began appearing online as early as 1983. Throughout the 1980s and 90s, he wrote daily news and feature articles for both the CompuServe and Prodigy online services. His "Behind the Screens" commentaries made him the world's first known professional blogger.
