In the past, the terms “data engineer” and “regulatory compliance” did not typically intersect in a business conversation. Data engineers were hard at work building systems to collect raw data for data scientists to analyze. Regulatory compliance, on the other hand, was the realm of another department, such as the legal or security team.
However, with the rise of international, state and industry data privacy laws, almost everyone within an organization has to understand how their job can help support compliance -- data engineers included. In fact, in the US, five states have passed data privacy legislation:
- California: Its existing data privacy law, the California Consumer Privacy Act (CCPA), is amended and expanded by the more comprehensive California Privacy Rights Act (CPRA), effective Jan. 1, 2023.
- Colorado: The Colorado Privacy Act (CPA) goes into effect on July 1, 2023.
- Connecticut: The Connecticut Data Privacy Act (CDPA) goes into effect on July 1, 2023.
- Utah: The Utah Consumer Privacy Act (UCPA) goes into effect on Dec. 31, 2023.
- Virginia: The Virginia Consumer Data Protection Act (VCDPA) goes into effect on Jan. 1, 2023.
The fines for noncompliance can be significant. The retailer Sephora was recently subject to a $1.2 million penalty for not disclosing to consumers that it was selling their data. Given the complexity of compliance -- and the fines -- it is no surprise that companies prioritize compliance-related efforts. A 2022 EMA survey found that 68% of organizations are using or are planning to use a regulatory compliance program as a market differentiator.
While compliance continues to climb the list of business priorities, data use carries more risk than ever. Cloud migration and data sharing have replaced the infrastructure and perimeter of on-premises data storage. Data has become harder to secure, and data protection requirements have grown more stringent in response.
Consequently, data engineers, like almost everyone else in the organization, must now understand how their use of and interaction with data affects compliance risk. They also need new skills to work in a compliance-heavy, cloud-first environment.
Let’s take a closer look at three areas where data engineers can support a company’s compliance efforts:
1. Data Sensitivity
How an organization protects data depends on its level of data sensitivity, and data engineers must have a deep understanding of what constitutes sensitive and non-sensitive data. In a compliance-heavy business environment, more sensitive data -- proprietary information, Social Security numbers, credit card numbers, addresses -- must have more rigorous protections when in use.
It is also essential to recognize that privacy laws define data sensitivity differently, which can complicate data protection. The CPRA, for example, has an extensive definition of personal information and goes so far as to add a subset called sensitive personal information. The CPA treats information related to a consumer’s sex life and mental and physical health conditions as sensitive data, and the VCDPA counts precise geolocation data as sensitive data. When data engineers can identify data sensitivity -- and take the proper actions based on that sensitivity -- they can drastically reduce the risk of noncompliance.
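To make the idea of law-dependent sensitivity concrete, here is a minimal Python sketch that reports which laws might treat a given column as sensitive. Everything in it is an assumption for illustration: the category labels, the column names and the per-law sets are hypothetical and deliberately incomplete, and they are not a legal reading of any statute.

```python
# Illustrative only: category labels and per-law sets are simplified,
# incomplete assumptions, not legal definitions. Confirm with legal counsel.
SENSITIVE_CATEGORIES_BY_LAW = {
    "CPRA": {"government_id", "precise_geolocation", "health", "sex_life"},
    "CPA": {"health", "sex_life"},
    "VCDPA": {"precise_geolocation"},
}

# Hypothetical mapping from column names to data categories.
COLUMN_CATEGORIES = {
    "ssn": "government_id",
    "gps_lat_lon": "precise_geolocation",
    "diagnosis_code": "health",
    "favorite_color": "preference",
}


def laws_treating_as_sensitive(column: str) -> list[str]:
    """Return, in sorted order, the laws whose sensitive set covers the column."""
    category = COLUMN_CATEGORIES.get(column)  # None for unknown columns
    return sorted(
        law
        for law, categories in SENSITIVE_CATEGORIES_BY_LAW.items()
        if category in categories
    )
```

In this sketch an unknown column matches no law, so a real pipeline would likely want to flag uncategorized columns for review rather than treat them as non-sensitive.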
2. Use Requirements
Once data is classified for sensitivity, it is much easier to determine how to handle it, depending on what will be done with the data. Data is constantly on the move: from on-premises systems to the cloud, between parties inside and outside the organization, and from higher environments, such as production, to lower ones, such as test. Companies must incorporate strong protection methods as data flows from location to location.
For example, there are times when a piece of sensitive, de-identified data must be re-identified. In these cases, the appropriate protections must be in place to ensure re-identified data is not exposed. From there, data engineers must know how the data can be used in its current state to ensure compliance. There is no linear, “do this, do that” prescription. Compliance varies with the unique combination of countries, states and industries in which a business operates.
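One way to picture handling rules that depend on both sensitivity and use is a small policy lookup plus a gate on re-identification requests. The Python sketch below is only that, a sketch. The environment names, roles, purposes and protection labels are hypothetical, and the rules themselves are assumptions rather than requirements drawn from any particular law.

```python
from dataclasses import dataclass

# Hypothetical policy table: (sensitivity, destination) -> required protection.
# Labels and rules are assumptions for illustration, not a compliance standard.
REQUIRED_PROTECTION = {
    ("sensitive", "production"): "tokenize",
    ("sensitive", "test"): "mask",
    ("sensitive", "external_partner"): "tokenize",
    ("non_sensitive", "production"): "none",
    ("non_sensitive", "test"): "none",
    ("non_sensitive", "external_partner"): "none",
}

# Hypothetical allow-list of (role, purpose) pairs permitted to re-identify.
ALLOWED_REIDENTIFICATION = {("fraud_analyst", "fraud_investigation")}


@dataclass(frozen=True)
class ReidentifyRequest:
    requester_role: str
    purpose: str
    destination: str


def required_protection(sensitivity: str, destination: str) -> str:
    """Look up the protection to apply before data moves to a destination."""
    # Fail closed: combinations missing from the table are blocked.
    return REQUIRED_PROTECTION.get((sensitivity, destination), "block")


def may_reidentify(request: ReidentifyRequest) -> bool:
    """Allow re-identification only for listed role/purpose pairs in production."""
    # Assumption: re-identified data must never land in a lower environment.
    if request.destination != "production":
        return False
    return (request.requester_role, request.purpose) in ALLOWED_REIDENTIFICATION
```

Failing closed on unknown combinations reflects the point that there is no single prescription: a new destination or jurisdiction should prompt a decision rather than slip through by default.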
3. Security Controls
Before the great migration to the cloud, data security controls were straightforward: Protect the network's perimeter to prevent unauthorized access to data. Because the cloud has no perimeter and data is continuously moving, companies must implement more security controls, and the form of control necessary depends on the data type and usage.
Data engineers must become acutely aware of the myriad controls in use. That includes understanding the differences among masking, tokenization and encryption, and when and how each should be applied without diminishing the data’s value. While data engineers aren’t responsible for applying these controls, they need to understand to whom they can and cannot send data. And they need enough background on security controls to spot when a protection method has not been applied or has been misapplied.
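The difference between masking and tokenization is easier to see side by side. The Python sketch below is a toy illustration under stated assumptions: `mask_card_number` and `TokenVault` are hypothetical names, the vault is an in-memory dictionary rather than a secured service, and encryption is left out because it should come from a vetted cryptography library rather than hand-written code.

```python
import secrets


def mask_card_number(card_number: str) -> str:
    """Masking: irreversibly hide all but the last four digits."""
    digits = [c for c in card_number if c.isdigit()]
    if len(digits) <= 4:
        # Too short to reveal anything safely; mask every digit.
        return "*" * len(digits)
    return "*" * (len(digits) - 4) + "".join(digits[-4:])


class TokenVault:
    """Tokenization: swap a value for a random token that the vault can reverse.

    Toy in-memory vault for illustration; a real one needs access control,
    persistence and auditing.
    """

    def __init__(self) -> None:
        self._token_to_value: dict[str, str] = {}
        self._value_to_token: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so joins on the tokenized column still work.
        if value not in self._value_to_token:
            token = "tok_" + secrets.token_hex(8)
            self._value_to_token[value] = token
            self._token_to_value[token] = value
        return self._value_to_token[value]

    def detokenize(self, token: str) -> str:
        # Raises KeyError for tokens this vault never issued.
        return self._token_to_value[token]
```

A masked value cannot be recovered, which suits display and test data, while a token can be reversed by whoever controls the vault. That is why a tokenized column can still be joined on and later re-identified under the right controls, and why spotting an unmasked or untokenized field in a downstream table is a useful skill.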
Data privacy regulations are only increasing, as are expectations for data protection. It takes a 360-degree view by everyone in the organization to keep people and data safe. Data engineers play a big part in helping organizations reduce risk. Keeping abreast of the latest data privacy laws and dedicating a portion of their time to educating themselves on data sensitivity, use requirements and security controls will go a long way toward ensuring privacy and compliance become part of a company’s fabric in the long term.