Data sanitization involves the secure and permanent erasure of sensitive data from datasets and media to guarantee that no residual data can be recovered, even through extensive forensic analysis.[1] Data sanitization has a wide range of applications but is mainly used for clearing end-of-life electronic devices or for the sharing and use of large datasets that contain sensitive information. The main strategies for erasing personal data from devices are physical destruction, cryptographic erasure, and data erasure. While the term data sanitization may lead some to believe that it only covers data on electronic media, it also broadly covers physical media, such as paper copies. Electronic files are termed soft copies, and physical paper documents are termed hard copies. Data sanitization methods are also applied to clean sensitive data within datasets, for example through heuristic-based methods, machine-learning-based methods, and k-source anonymity.[2]

This erasure is necessary because an increasing amount of data is moving to online storage, which poses a privacy risk if a device holding that data is resold to another individual. The importance of data sanitization has risen in recent years as private information is increasingly stored in electronic form and larger, more complex datasets are used to distribute private information. Electronic storage has expanded and enabled more private data to be stored, so more advanced and thorough sanitization techniques are required to ensure that no data is left on a device once it is no longer in use. Technological tools that enable the transfer of large amounts of data also allow more private data to be shared. With the increasing popularity of cloud-based information sharing and storage in particular, ensuring that all shared data is cleaned has become a significant concern. It is therefore sensible for governments and private industry to create and enforce data sanitization policies to prevent data loss and other security incidents.

While the practice of data sanitization is common knowledge in most technical fields, it is not consistently understood across all levels of business and government. A comprehensive data sanitization policy is therefore needed in government contracting and private industry in order to avoid the possible loss of data, leaking of state secrets to adversaries, disclosure of proprietary technologies, and possible exclusion from contract competition by government agencies.

In an increasingly connected world, it has become even more critical that governments, companies, and individuals follow specific data sanitization protocols to ensure that the confidentiality of information is sustained throughout its lifecycle. This step is critical to the core information security triad of Confidentiality, Integrity, and Availability. This CIA triad is especially relevant to those who operate as government contractors or handle other sensitive private information. To this end, government contractors must follow specific data sanitization policies and use these policies to enforce the National Institute of Standards and Technology (NIST) recommended guidelines for media sanitization covered in NIST Special Publication 800-88.[3] This applies especially to any government work that involves CUI (Controlled Unclassified Information) or above, and it is required by DFARS Clause 252.204-7012, Safeguarding Covered Defense Information and Cyber Incident Reporting.[4] While private industry may not be required to follow NIST 800-88 standards for data sanitization, doing so is typically considered a best practice across industries that handle sensitive data. To further compound the issue, the ongoing shortage of cyber specialists and confusion about proper cyber hygiene have created a skill and funding gap for many government contractors.

Failure to follow these recommended sanitization policies may result in severe consequences, including losing data, leaking state secrets to adversaries, losing proprietary technologies, and being barred from contract competition by government agencies.[5] Therefore, the government contractor community must ensure that its data sanitization policies are well defined and follow NIST guidelines for data sanitization. Additionally, while data sanitization may seem to focus on electronic “soft copy” data, other data sources such as “hard copy” documents must be addressed in the same sanitization policies.

To examine existing data sanitization policies and determine the impact of not developing, using, or following these policy guidelines and recommendations, research data was gathered not only from the government contracting sector but also from other critical industries such as defense, energy, and transportation. These were selected because they typically also fall under government regulations, so NIST (National Institute of Standards and Technology) guidelines and policies would also apply to them in the United States. The primary data comes from a study performed by the independent research company Coleman Parkes Research in August 2019.[6] This research project targeted senior cyber executives and policy makers and surveyed over 1,800 senior stakeholders. The data from Coleman Parkes shows that 96% of organizations have a data sanitization policy in place; however, in the United States, only 62% of respondents felt that the policy is communicated well across the business. It also reveals that remote and contract workers were the least likely to comply with data sanitization policies. This trend has become a more pressing issue as many government contractors and private companies have been working remotely due to the COVID-19 pandemic, and remote work is likely to continue after the return to normal working conditions.

On June 26, 2021, a basic Google search for “data lost due to non-sanitization” returned over 20 million results. These included articles on data breaches and the loss of business, military secrets and proprietary data losses, PHI (Protected Health Information),[7] PII (Personally Identifiable Information),[8] and many articles on performing essential data sanitization. Many of these articles also point to existing data sanitization and security policies of companies and government entities, such as the U.S. Environmental Protection Agency's “Sample Policy and Guidance Language for Federal Media Sanitization”.[9] Based on these articles and the NIST 800-88 recommendations, and depending on its data security level or categorization, data should be:[3]

  • Cleared – Provides a basic level of data sanitization by overwriting data sectors to remove any previous data remnants that a basic format would not remove. Again, the focus is on electronic media. This method is typically used if the media is going to be re-used within the organization at a similar data security level.
  • Purged – May use physical methods (degaussing) or logical methods (sector overwrite) to make the target media unreadable. Typically used when the media is no longer needed and is at a lower data security level.
  • Destroyed – Permanently renders the data irretrievable and is commonly used when media is leaving an organization or has reached its end of life, e.g., paper shredding or hard drive/media crushing and incineration. This method is typically used for media containing highly sensitive information and state secrets that could cause grave damage to national security or to the privacy and safety of individuals.
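
The three categories above can be read as a simple decision rule. The following Python sketch is an illustration only: it encodes the conditions as they are worded in the list, not the full decision flow of NIST 800-88, and the names `Media`, `Method`, and `choose_method` are hypothetical and do not come from any standard or library.

```python
from dataclasses import dataclass
from enum import Enum


class Method(Enum):
    CLEAR = "clear"
    PURGE = "purge"
    DESTROY = "destroy"


@dataclass
class Media:
    """Hypothetical description of a storage medium awaiting sanitization."""
    highly_sensitive: bool        # e.g. state secrets or highly sensitive personal data
    leaving_organization: bool    # media will leave the organization's control
    end_of_life: bool             # media has reached its end of life
    reused_at_same_level: bool    # media will be re-used internally at a similar security level


def choose_method(media: Media) -> Method:
    """Pick a sanitization category using the simplified rules from the list above."""
    # Destruction: highly sensitive content, or media leaving the
    # organization or reaching its end of life.
    if media.highly_sensitive or media.leaving_organization or media.end_of_life:
        return Method.DESTROY
    # Clearing: media stays in the organization at a similar security level.
    if media.reused_at_same_level:
        return Method.CLEAR
    # Otherwise the media is no longer needed at its current level: purge it.
    return Method.PURGE


# Example: a laptop drive that will be reissued to a colleague in the same team.
reissued_drive = Media(
    highly_sensitive=False,
    leaving_organization=False,
    end_of_life=False,
    reused_at_same_level=True,
)
print(choose_method(reissued_drive).value)
```

A real policy would likely weigh additional factors, such as the media type and whether the chosen method can be verified, so this rule should be treated as a starting point for discussion rather than as a compliance check.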

