CDMC Framework – 4.0 Protection & Privacy

4.0 Protection & Privacy

CDMC Framework Component 4

Upper Matter

Introduction

Protecting the content and the privacy of data in the cloud is a critical requirement in today’s cloud environments. Organizations that employ cloud computing technology may be required to comply with multiple jurisdictions' data protection and privacy legislation. The compliance burden can be quite heavy for an organization that is a member of a regulated industry. Teams planning integration with a cloud service provider (CSP) must exhibit data protection and privacy capabilities that are sufficient to meet both internal policy mandates and external regulatory requirements.

Description

The Protection & Privacy component is a set of capabilities for collecting evidence that demonstrates compliance with the organizational policy for data sensitivity and protection. The purpose of these capabilities is to ensure that all sensitive data has adequate protection from compromise or loss as required by regulatory, industry and ethical obligations.

Scope
  • Implement a Data Loss Protection regime.
  • Provide evidence that demonstrates the application of required data security controls.
  • Define and approve a data privacy framework.
  • Operationalize the data privacy framework.
  • Apply data obfuscation techniques to all data types according to classification and security policies.
Overview

Effective and timely management of a large IT infrastructure demands that data protection and privacy evidence collection be reliable, consistent and highly automated. Many organizations sensibly view an external CSP environment as posing a higher risk than an internal system and conclude that this additional risk necessitates more stringent controls. Additional risk factors also come into play with hybrid-cloud solutions and the complexity of feature variation among multiple CSPs.

The numerous challenges of adding complexity and risk to an existing framework of data protection controls can significantly hinder the adoption of otherwise attractive technologies. It is vital to identify and implement best practices for data protection that balance the risks and rewards of integrating with a CSP.

Managing sensitive data entails risk. Implementing data protection controls is the most effective approach toward mitigating the universal threats of disclosure, alteration, misuse and repudiation. Effective risk management requires balance. The Data Manager must apply and monitor adequate data protection controls while maintaining ready access to sensitive data for operational and analytical uses.

Organizations should adopt a Zero Trust framework to limit access to specific applications and resources to authorized users. The Zero Trust model assumes breach and verifies each request as though it originated from an open network. Zero Trust teaches us never to trust and always to verify, regardless of where the request originates or what resource it accesses. Every access request is fully authenticated, authorized and encrypted before access is granted. Microsegmentation and least-privilege access principles are applied to minimize lateral movement.
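The Zero Trust checks described above can be sketched as a simple policy function. This is an illustrative sketch only; the roles, resources and request fields are hypothetical.

```python
from dataclasses import dataclass

# Least-privilege allow-list: each role may reach only the resources it needs.
ROLE_PERMISSIONS = {
    "analyst": {"reporting-db:read"},
    "admin": {"reporting-db:read", "reporting-db:write"},
}

@dataclass
class AccessRequest:
    role: str
    action: str              # e.g. "reporting-db:read"
    authenticated: bool      # identity fully verified (e.g. MFA passed)
    encrypted_channel: bool  # request arrived over an encrypted channel

def evaluate(req: AccessRequest) -> bool:
    """Assume breach: deny unless every check passes explicitly."""
    if not req.authenticated:
        return False
    if not req.encrypted_channel:
        return False
    return req.action in ROLE_PERMISSIONS.get(req.role, set())

print(evaluate(AccessRequest("analyst", "reporting-db:read", True, True)))   # True
print(evaluate(AccessRequest("analyst", "reporting-db:write", True, True)))  # False
```

Note that the network origin of the request never appears in the decision: every request is verified the same way, which is the core of the model.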

Securing sensitive data in a cloud environment requires transferring some responsibility for comprehensive data security to the CSP. In this shared responsibility model, it is vital to ensure the accountability of each participant.

When preparing to transfer data management into a cloud environment, each of these steps must be followed:

  • Apply adequate levels of data encryption to every CSP data transmission and data store.
  • Demonstrate data protection controls that enforce organization policies and privacy classifications.
  • Ensure controls are fully effective across the entire data lifecycle.
  • Conform sensitive data access permissions to the principles of need-to-know and access-by-least-privilege while balancing data usability needs across the organization.
  • Ensure Data Loss Prevention controls are in place, minimizing the ability to exfiltrate data.

Data protection requirements should be driven by a data classification scheme to ensure that the right controls operate correctly, in the right place and at the right time. Refer to CDMC 2.0 Cataloging & Classification.

The table below gives an example of a simple information sensitivity classification scheme for a cloud environment.

Data Classification          Cloud Environment Encryption
Public                       No requirement
Internal Use Only            Data-in-motion: Encrypt
                             Data-at-rest: Service-Based or above
                             SDLC Use: No requirement
Confidential                 Data-in-motion: Encrypt
                             Data-at-rest: Service-Based or above
                             SDLC Use: Protection needed
Highly Confidential          Data-in-motion: Encrypt
                             Data-at-rest: Application Level Encryption
                             SDLC Use: Protection needed
Price-Sensitive and Secret   Data-in-motion: Encrypt
                             Data-at-rest: Application Level Encryption
                             SDLC Use: Protection needed

A data privacy framework consists of the people, processes, data and technologies that support business needs, satisfy regulatory obligations, promote trust and deliver appropriate risk-balanced data privacy outcomes. Data privacy encompasses both organizations' and individuals' obligations and rights to manage personal sensitive data. Data management practices and controls must be trustworthy, ethical and compliant throughout the entire data lifecycle.

An organization that integrates with a cloud environment must consider how the integration should reshape its data privacy framework. An organization implements and operationalizes its data privacy framework through a data privacy program, which typically addresses each of the following:

  • Accountability, governance and oversight mechanisms
  • Documented policies, procedures and processes
  • Documented roles and responsibilities
  • Privacy operations and supporting technology
  • Training

A cloud environment impacts data privacy in many ways, including:

  • Availability of functionality. Cloud technology provides functionality, opportunities and approaches for managing data across the entire data lifecycle. Any serious consideration of new cloud computing technologies should prompt a review and enhancement of the data privacy framework.
  • Jurisdictional diversity. Cloud computing functionality and opportunities increase the potential for data to traverse multiple local or regional jurisdictions. Consequently, a data privacy framework must be flexible and resilient enough to accommodate many types of legal or regulatory requirements. Refer to CDMC 1.4 Data Sovereignty and Cross-Border Data Movement are Managed.
  • Shared responsibility. Operational use of commercial cloud environments follows a shared responsibility model. Final responsibility and regulatory accountability remain with the organization adopting the cloud technology. Consequently, it is essential that any contract with the CSP clearly defines and delegates roles, accountabilities, responsibilities, metrics and measures. To consistently implement its data privacy framework across all operations, the organization must obtain complete clarity on all expectations and responsibilities of the CSP.
  • Proliferation of data. Cloud environments offer low storage costs and easy data movement, so the risk of data proliferating to multiple data consumers is much higher. Consequently, the risk of privacy violations or breaches also increases.
Value Proposition

An organization that consistently implements data protection controls will adopt new cloud computing technologies more rapidly and effectively. Also, systems that operate with integral data protection controls are more cost-efficient than retrofitting custom controls.

Applying information sensitivity classification standards to integrations with CSPs can greatly improve management, monitoring, enforcement and automation of data privacy controls that meet internal, industry and regulatory requirements.

Historically, some organizations have been hesitant to pursue CSP integrations, primarily because of security concerns. Cloud services have made significant improvements in security and privacy capabilities, integral automation and transparency. These improvements allow effective and efficient privacy risk management across the entire data lifecycle through the application of privacy-by-design.

Organizations preparing to integrate cloud computing can access extensive expertise to manage large-scale data repositories in cloud computing environments.

Core Questions
  • Has a Data Loss Prevention regime been established?
  • Does a documented encryption policy support an approved encryption strategy?
  • Can the organization provide evidence of data security controls?
  • Is the data privacy framework updated to manage the impact of cloud adoption and integration?
  • Is the internal data privacy framework in operation?
  • Does the data privacy framework in operation cover all CSP integrations?
  • Have data obfuscation techniques been selected, supported and applied?
Core Artifacts
  • Data Privacy Framework – that reflects requirements for the cloud
  • Data Privacy Controls Log – that demonstrates the effectiveness of the controls
  • Data Obfuscation and Encryption Strategy
  • Data Management Policy, Standard and Procedure – defining and operationalizing data obfuscation and encryption
  • Data Loss Prevention Methodology – that includes roles and responsibilities
  • Data Security Controls Log – that demonstrates the effectiveness of the controls

4.1 Data is Secured, and Controls are Evidenced

The organization’s policy for the encryption of data must be extended to cloud environments. These policies must be enforced for data-at-rest, data-in-motion and data-in-use, and evidence of the implementation of these controls must be captured. Securing data goes beyond encryption: techniques for the obfuscation of sensitive data must be supported and adopted in all environments. A Data Loss Prevention regime must be in place and must cover both on-premises and cloud environments.

Description

Data assets are classifiable by sensitivity level. For each combination of state and sensitivity level, a data encryption standard must be enforced by implementing suitable encryption procedures available through a cloud service provider (CSP). Refer to CDMC 2.2 Data Classifications are Defined and Used.

Objectives
  • Protect sensitive data with encryption to mitigate threats, including disclosure, modification, misuse, or attack.
  • Protect sensitive data with encryption to a level that is acceptable to the organization.
  • Protect sensitive data with encryption to a level specified by regulatory obligations.
  • Consistently apply encryption to the extent that it meets or exceeds the risk level and corresponds to the organization's risk appetite and data ethics.
  • Consistently apply an encryption key management scheme that balances acceptable risk, the potential for functional loss and operational complexity.
Advice for data practitioners

Data can exist in one of three states:

  • Data-at-rest
  • Data-in-motion
  • Data-in-use

    Encryption of data-at-rest

    Data-at-rest is data that resides in physical storage and is not in transit. This includes data residing in a database, a file or on disk. An organization should encrypt all data-at-rest to mitigate the risks of malicious actions such as disclosure, changes to sensitive information or unauthorized access. It is also important to consider applying this type of encryption to archived data.

    All CSPs offer some form of encryption for data-at-rest, which may be service-based or server-side encryption. A CSP may permit an organization to manage the encryption key lifecycle and thereby control how applications and services use the keys. Also, an organization may choose to generate encryption keys and store those keys in a hardware security module (HSM) provided by the CSP. Another common method is for the organization to import encryption keys into the CSP encryption solution while retaining backup copies in an on-premises HSM. See the Encryption Key Management Schemes section below for more detail on these choices.

    Encryption of data-in-motion

    Data-in-motion should be encrypted to ensure that it is accessible only to the intended recipients and unintelligible to any potential interceptor.

    Encrypting data-in-motion considerations apply to various parts of data architecture, including API calls to CSP service endpoints, data transfers among CSP service components and data movements within applications. The first two considerations are the CSP's responsibility, and the last consideration is the organization's responsibility. The organization must also consider encrypting data-in-motion for any data movements between the organization and any third party.

    The Transport Layer Security (TLS) protocol should be used to encrypt all data-in-motion. As of this writing, NIST SP 800-52 provides specific guidance for selecting and configuring TLS protocol implementations. Consider employing Federal Information Processing Standards (FIPS) 140-2 endpoints, if applicable; such endpoints use a cryptographic library that meets the FIPS 140-2 standard. For financial institutions that manage workloads on behalf of the US government, the use of FIPS 140-2 endpoints may be mandatory to satisfy government compliance requirements.
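As a minimal illustration of enforcing a TLS floor in application code, the Python standard-library `ssl` module can be configured to refuse protocol versions below TLS 1.2, consistent with NIST SP 800-52 guidance. This is a sketch of the context configuration only, not a complete client.

```python
import ssl

# Build a client-side TLS context that refuses anything below TLS 1.2.
context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_2

# Certificate validation and hostname checking stay on (the library defaults).
print(context.minimum_version == ssl.TLSVersion.TLSv1_2)   # True
print(context.check_hostname)                               # True
print(context.verify_mode == ssl.CERT_REQUIRED)             # True
```

The same context would then be passed to the connection layer (for example, an HTTPS client), so every outbound transfer inherits the policy.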

    Encryption of data-in-use

    Data-in-use is data in the process of modification or maintenance. Until recently, it has been necessary for data to be decrypted in memory during processing. Privileged users such as DBAs, system administrators and CSP operators may access such plaintext data in memory. Cyber intruders may illicitly gain access to such data.

    As of this writing, preventative encryption controls for data-in-use are at an early stage of industry development for private and public Software-as-a-Service. Data practitioners should perform risk analysis and evaluate the use of any preventative controls that are part of the emerging confidential computing model [1]. Similar analysis should be done for compensating and detective controls such as just-in-time (JIT) privileged access, per-access customer authorization of administrative logins into a lockbox and Security Information and Event Management (SIEM) solutions that monitor potential breaches.

    Application-level encryption

    Organization-side encryption encrypts sensitive data elements before transmission to any storage environment such as a database or cloud storage. Applying this type of encryption ensures that sensitive data elements will be encrypted before reaching the CSP. Because a CSP doesn't have access to the organization’s encryption keys, it cannot decrypt the data. It is important to realize that the inability to decrypt the data may limit, degrade or disable the CSP functions for querying the data.
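A conceptual sketch of organization-side (application-level) encryption follows: the field is encrypted before transmission, so the CSP never sees plaintext or the key. The XOR keystream below is for illustration only and is not production-grade cryptography; a real implementation should use a vetted authenticated cipher such as AES-GCM from an established library.

```python
import hashlib
import secrets

# ILLUSTRATIVE ONLY: a toy keystream cipher to show the encrypt-before-upload
# flow. Do not use this construction in production.

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt_field(key: bytes, plaintext: bytes) -> tuple[bytes, bytes]:
    """Encrypt a data element client-side; only ciphertext leaves the organization."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce, bytes(a ^ b for a, b in zip(plaintext, stream))

def decrypt_field(key: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))

key = secrets.token_bytes(32)          # held by the organization, never the CSP
nonce, ct = encrypt_field(key, b"123-45-6789")
print(decrypt_field(key, nonce, ct))   # b'123-45-6789'
```

Because the CSP stores only `nonce` and `ct`, its query functions over this field are limited to exact-match on ciphertext at best, which illustrates the functional trade-off noted above.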

    It is possible to combine application-level encryption with the three other encryption types to achieve multiple layers of protection. Refer to CDMC 4.1.3 Data obfuscation techniques are defined and applied for alternative approaches to protecting application data elements.

    Encryption key management schemes

    [1] For an introduction to confidential computing, see, e.g., “The Rise of Confidential Computing” in IEEE Spectrum, June 2020.

    Encryption is useless if the encryption keys are not secure. Most CSPs offer several different key management solutions to accommodate the requirements of various data classifications. Much of the difference among these solutions pertains to the shared management of the encryption keys. The table below explains several key management schemes.

    Key Management Responsibility Key Management Scheme (KMS)
    CSP CSP-managed keys: Organizations delegate responsibility to the CSP for generating, managing and controlling keys throughout the data lifecycle. This option is available with most CSPs.
    Shared option 1 Organization-managed encryption keys: The CSP key management scheme is used for the entire encryption-key lifecycle. The CSP and other organization-operated services may be permitted to use keys for encryption and decryption of organization data.
    Shared option 2 Organization-supplied encryption keys (bring-your-own-key): The organization operates key management processes and infrastructure external to the CSP. Organizations upload their encryption keys to the CSP's key management scheme. The CSP and other organization-operated services may be permitted to use keys for encryption and decryption of organization data.
    Organization Organization-side key management (hold-your-own-key): An organization's internal key management infrastructure generates its keys. These keys encrypt data before transmitting it to the CSP. The CSP and other organization-operated services may be permitted to use keys for encryption and decryption of organization data.

    Irrespective of the key management scheme, a data practitioner should verify that technologies and practices for managing encryption keys meet the organization's current standards, guidelines, and regulatory requirements. Encryption keys are sensitive and business-critical. The use of encryption keys should be restricted to authorized applications and users. Restrictions should also apply to processes that validate access permissions. In particular, the data practitioner should be aware of these relevant technologies and practices:

    • Role-based access control (RBAC) and least-privilege access principles that limit access to encryption keys.
    • Network-level access controls that restrict the management of encryption keys wherever possible.
    • Recovery options, such as soft-delete and purge protection, that prevent accidental or malicious key deletion.
    • An encryption key lifecycle policy and procedures that include periodic key rotation and immediate (emergency) key rotation.
    • A system for retaining data in a storage account that is (a) under organization control, (b) managed with policy restrictions and (c) configured so that settings are readily verifiable against the policy restrictions.
    • A trustworthy log retention and management system for tracking and auditing key usage events such as encryption and decryption operations.
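The lifecycle-policy practices above can be partially automated. The sketch below flags keys that are overdue for periodic rotation or that need immediate rotation after compromise; the key names, rotation period and record shape are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

ROTATION_PERIOD = timedelta(days=365)  # illustrative policy value

@dataclass
class ManagedKey:
    key_id: str
    created: datetime
    compromised: bool = False  # triggers immediate (emergency) rotation

def rotation_due(key: ManagedKey, now: datetime) -> bool:
    """True if the key must be rotated under the lifecycle policy."""
    if key.compromised:
        return True
    return now - key.created >= ROTATION_PERIOD

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
keys = [
    ManagedKey("kms-key-a", datetime(2023, 1, 1, tzinfo=timezone.utc)),
    ManagedKey("kms-key-b", datetime(2024, 5, 1, tzinfo=timezone.utc)),
    ManagedKey("kms-key-c", datetime(2024, 5, 1, tzinfo=timezone.utc), compromised=True),
]
due = [k.key_id for k in keys if rotation_due(k, now)]
print(due)  # ['kms-key-a', 'kms-key-c']
```

In practice such a check would read key metadata from the CSP key management API and feed its findings into the audit log described above.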

    Example of an encryption policy

    An organization may have specific risk or impact profiles that require a precise set of encryption controls. The controls given in the table below provide an example of an encryption policy. Controls should be customized to match the needs and risk appetite of the organization.

    Column key – Encryption in Transit: data in packets on the wire; Encryption at Rest: data in non-volatile memory; Encryption in Use: data in volatile memory; Application-Level Encryption: application data fields.

    Critical/Secret (extreme loss or harm; includes highly sensitive data, payments data)
      In transit: Required | At rest: Required, with organization-managed keys | In use: Considered | Application-level: Required, with organization-managed keys

    Highly Confidential (material loss or risk)
      In transit: Required | At rest: Required | In use: Not required | Application-level: Required

    Confidential
      In transit: Required | At rest: Required | In use: Not required | Application-level: Not required

    Internal Use Only
      In transit: Required | At rest: Required | In use: Not required | Application-level: Not required

    Public
      In transit: Not required | At rest: Not required | In use: Not required | Application-level: Not required
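A policy table like the example above can also be encoded as data so that deployment pipelines can query required controls programmatically. The sketch below assumes the example table; the helper and constant names are hypothetical.

```python
REQUIRED, NOT_REQUIRED, CONSIDERED = "required", "not required", "considered"

# (in transit, at rest, in use, application-level), per sensitivity level.
ENCRYPTION_POLICY = {
    "critical/secret":     (REQUIRED, REQUIRED, CONSIDERED, REQUIRED),
    "highly confidential": (REQUIRED, REQUIRED, NOT_REQUIRED, REQUIRED),
    "confidential":        (REQUIRED, REQUIRED, NOT_REQUIRED, NOT_REQUIRED),
    "internal use only":   (REQUIRED, REQUIRED, NOT_REQUIRED, NOT_REQUIRED),
    "public":              (NOT_REQUIRED, NOT_REQUIRED, NOT_REQUIRED, NOT_REQUIRED),
}
STATES = ("in transit", "at rest", "in use", "application-level")

def requirement(level: str, state: str) -> str:
    """Look up the control required for a sensitivity level and data state."""
    return ENCRYPTION_POLICY[level.lower()][STATES.index(state)]

print(requirement("Confidential", "at rest"))    # required
print(requirement("Critical/Secret", "in use"))  # considered
```

Representing the policy as data rather than prose makes it testable and lets a pipeline fail fast when a deployment does not meet the required control.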
    Advice for Cloud Service and Technology Providers

    Cloud service and technology providers should integrate application-level encryption capabilities into managed services that store sensitive data. This integration will make it easier for organizations to use application-level encryption and to derive value and insights from the encrypted data. For cases where application-level encryption is impossible or impractical (as with some machine learning workloads), the provider should offer the built-in mitigating capabilities described in the Encryption of data-in-use section above.

    Providers should make it straightforward to evidence both the existence of encryption controls and their operational effectiveness across a broad set of cloud data resources and across large cloud environments where many accounts, regions and data services exist.

    CSPs should continue to innovate in encryption and non-encryption controls for protecting all data-in-use against unauthorized access. Data-in-use includes active data in non-persistent memory such as RAM, CPU caches and CPU registers. Data-in-use often contains sensitive data such as digital certificates, encryption keys, personally identifiable information and intellectual property such as software algorithms and design data. Conventional encryption technologies do not protect data-in-use.

    The protection of data-in-use has become a growing concern for businesses, government agencies and other institutions. Threats to data-in-use include cold-boot attacks, the connection of malicious hardware devices, rootkits, bootkits and side channels. Compromising data-in-use often exposes encrypted data-at-rest and data-in-motion as well. For example, an unauthorized user with access to RAM can locate an encryption key for data-at-rest and use it to access sensitive data.

    Questions
    • Has an encryption policy been documented and approved?
    • Does the encryption policy document accurately portray the risk exposure and the desired level of protection for each category of data in the cloud?
    • Do the encryption capabilities offered by the CSP include options for key management and have these capabilities been documented and assessed?
    • Have the organization's security and privacy stakeholders reviewed and approved the data encryption and encryption key management strategies?
    • Are monitoring, logging and alerting measures in place to monitor the operational effectiveness of the encryption strategy?
    • Has a regime been established for reviewing key management practices and technology in use by both the CSP and internal staff?
    Artifacts
    • Data Encryption Strategy
    • Data Management Policy, Standard and Procedure – defining and operationalizing data encryption
      • Before installation or upgrade of an application
      • Before migrating data to a CSP
    • Data Catalog – containing all classification information necessary for protecting data
    • Security Treatment Plan – containing each application's level of encryption and other risk mitigations
    • Encryption Strategy Operational Effectiveness Logs
    • Key Management Review Procedure – covering review of key management practices and technology in use by both the CSP and internal staff
    Scoring

    Not Initiated

    No formal data encryption policy exists.

    Conceptual

    No formal data encryption policy exists, but the need is recognized, and the development is being discussed.

    Developmental

    Formal data encryption policy is being developed.

    Defined

    Formal data encryption policy is defined and validated by stakeholders.

    Achieved

    Formal data encryption policy is established and adopted by the organization.

    Enhanced

    Formal data encryption policy is established as part of business-as-usual practice with continuous improvement.

    4.1.2 Security Controls are Evidenced

    Description

    Data security policies require establishing data protection controls for any data element that qualifies for one or more information sensitivity classifications. Design and implementation of these controls must be done early in a system or software development project. However, design and implementation are necessary but not sufficient to demonstrate compliance with policies. As part of an internal or external audit, it may be necessary to obtain evidence of recent application of the controls and the extent to which those controls have been effective.

    This sub-capability requires the inclusion of observable and collectible evidence that demonstrates the presence of data protection controls. The evidence must link directly to data catalogs and applicable information sensitivity classifications.

    Evidence should be obtainable from native, local, or third-party applications using Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS), or Software-as-a-Service (SaaS) services. Wherever it is practicable, evidence collection should be automatic. When the evidence reveals exposed sensitive data or evidence indicates missing or deficient controls, a resolution plan must be documented to remedy such deficiencies.

    Objectives
    • Define a method for obtaining evidence of data protection controls.
    • Proactively collect evidence that sensitive data is secure and complies with the organization's data classifications and data handling policies.
    • Implement a method of ingesting, storing, processing and analyzing evidence.
    • Collect evidence that sensitive data is secure according to regulatory obligations.
    • Demonstrate that the organization consistently applies controls for securing data according to risk appetite and data ethics.
    Advice for data practitioners

    Know where controls are necessary

    An organization should use data classifications to identify the data elements that must be secure when stored in a cloud environment. The organization should establish sufficient controls for securing all sensitive data. Activity logs and continuous compliance checks should provide evidence of the controls. Refer to CDMC 2.2 Data Classifications are Defined and Used.

    Know what controls are necessary

    Some controls are implemented as a default configuration that is broad in scope, such as always-encrypted data storage volumes. However, it may be necessary to identify and enable specific controls wherever precise verification is mandatory. Examples of such specific validation include periodic verification of authorized access to sensitive data elements and exhaustive transaction monitoring with a logging facility.

    Observable evidence in custom applications

    All custom applications must comply with data control requirements defined by the organization's privacy and data security policies. Third-party or open-source functions must also accommodate compliance with required data controls. Various architectural patterns and open-source technology frameworks, such as those from the Fintech Open Source Foundation (FINOS), are available for adding, managing and observing controls. Also, the organization should consider the best approach for implementing standards for controls in data management systems.

    Establish data controls in systems and services

    Wherever practicable, data protection controls should be implemented in each system. A cloud service provider will typically provide configurations and deployment options to activate controls for the cloud environment or some cloud services. Controls that are implemented for each of the data management systems should be readily observable.

    Evidence collection is also easier to implement by engaging with the CSP APIs for accessing system logs and service configurations.

    Infrastructure-as-code (IaC) templates

    Organizations should consider using standardized, automatic and repeatable IaC templates, which can be quite helpful in implementing appropriate data protection controls to secure data. IaC templates, for example, can automatically activate data encryption for data-at-rest and data-in-motion configurations. In addition, IaC templates can help simplify evidence collection by shifting the focus to shared segments of the deployment pipeline.
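As a sketch of this idea, the check below inspects an IaC template (modeled as a plain Python dict, not any specific IaC dialect) and reports storage resources that omit at-rest encryption; all resource names and fields are hypothetical.

```python
def encryption_violations(template: dict) -> list[str]:
    """Return the names of storage resources missing at-rest encryption."""
    violations = []
    for name, resource in template.get("resources", {}).items():
        if resource.get("type") == "storage":
            if not resource.get("properties", {}).get("encryption_at_rest", False):
                violations.append(name)
    return violations

template = {
    "resources": {
        "customer_db": {"type": "storage",
                        "properties": {"encryption_at_rest": True}},
        "staging_bucket": {"type": "storage",
                           "properties": {}},                 # missing control
        "api_gateway": {"type": "network", "properties": {}}, # not in scope
    }
}
print(encryption_violations(template))  # ['staging_bucket']
```

Run as a pre-deployment gate, a check like this turns the encryption policy into evidence: a clean run is itself a collectible record that the control was applied.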

    Collect and report evidence

    Custom applications, as well as IaaS, PaaS and SaaS solutions, may have various data protection controls and various mechanisms for gathering evidence for those controls. To help document evidence that sufficient controls exist, organizations should clearly understand how data is ingested through each data source. This evidence includes logs, IaC artifacts and CSP configuration settings.

    Treatment of gaps in evidence collection

    In a typical organization, many and varied data protection controls operate independently in several system contexts. Since gaps may exist in an application or in the ability of the CSP to collect evidence automatically, additional evidence may be necessary to show that all controls satisfy all policy objectives. The organization should plan to resolve these gaps.
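Gap treatment can be sketched as a reconciliation between the controls required by policy and the evidence actually collected; the data-store and control identifiers below are hypothetical.

```python
# Controls each data store must evidence, per policy (illustrative identifiers).
required_controls = {
    "orders-db": {"encryption-at-rest", "access-logging", "key-rotation"},
    "exports-bucket": {"encryption-at-rest", "dlp-scan"},
}

# Controls for which automatic evidence was actually collected.
collected_evidence = {
    "orders-db": {"encryption-at-rest", "access-logging"},
    "exports-bucket": {"encryption-at-rest", "dlp-scan"},
}

# Any required control lacking evidence is a gap needing manual proof or remediation.
gaps = {
    store: sorted(required - collected_evidence.get(store, set()))
    for store, required in required_controls.items()
    if required - collected_evidence.get(store, set())
}
print(gaps)  # {'orders-db': ['key-rotation']}
```

The resulting gap list is a natural input to the resolution plan the text calls for.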

    Advice for Cloud Service and Technology Providers

    In many contemporary organizations, data practitioners increasingly rely on cloud service providers to manage and store critical data. CSP agreements with an organization commonly define a shared burden for enforcing the protection of sensitive data. This shared burden means that a CSP has partial responsibility for ensuring data protection and for supporting evidence collection that demonstrates the security of sensitive data. The CSP should offer interfaces, tools, logs and reports that data practitioners can readily access as they collect and exhibit evidence for all active data protection controls.

    • CSPs should ensure that adequate evidence for active data protection controls is readily available through APIs that provide access to the CSP logs, service configurations and continuous compliance monitoring tools.
    • The CSP should provide a simple method to support evidence collection on active data protection controls for applications and services used to store and manage sensitive data.
    • The CSP should provide simple methods for integrating subsystems that gather and convey evidence from data catalogs and classification systems.
    • The CSP should support always-on controls to secure data for the entire organization.
    • The CSP should provide a near real-time inventory of available cloud resources, including data stores.
    • Across all services, the CSP should provide compliance monitoring tools that automatically detect and report any changes in the data security configuration. This reporting can greatly simplify audits and reviews.
    • The CSP should provide a reliable repository of evidence that will meet inspection requirements for control functions. This repository should provide simple data extraction and inspection methods and a reliable archiving solution that ensures complete data integrity.
    Questions
    • Has a method for providing evidence of controls been defined?
    • Are policies and design practices in place for any custom applications that have been deployed to the CSP?
    • Does the CSP provide methods for monitoring and ensuring that mandatory controls are active and functioning properly?
    • Do processes exist for identifying and adjusting any misconfiguration of active controls?
    • Is there agreement on how to store evidence and compare it to the catalog of active controls?
    • Do processes exist for identifying and resolving any gaps in the active controls?
    Artifacts
    • Data Catalog Report – evidencing execution of required data classifications
    • Active Controls Log
    • Evidence Collection and Review Plan
    • Issue Management Report – evidencing capture and resolution of data security defects
    • Applications Security Treatment Plan – listing the controls, required evidence, and necessary mitigations
    Scoring

    Not Initiated

    No formal ability to evidence the implementation of security controls exists.

    Conceptual

    No formal ability to evidence the implementation of security controls exists, but the need is recognized, and the development is being discussed.

    Developmental

    The formal ability to evidence the implementation of security controls is being developed.

    Defined

    The formal ability to evidence the implementation of security controls has been defined and validated by stakeholders.

    Achieved

    The formal ability to evidence the implementation of security controls is established and adopted by the organization.

    Enhanced

    The formal ability to evidence the implementation of security controls is established as part of business-as-usual practice with continuous improvement.

    Description

    An organization derives information from all kinds of data to operate and drive business. For organizations that interact with customers and other organizations, much of the data is sensitive or proprietary—or both. It is essential to implement security and privacy measures that protect the interests of data consumers and custodians of the data. In contemporary computing, a variety of data obfuscation techniques are available for protecting data. Techniques should be chosen according to sensitivity classification, business requirements and organizational risk appetites. In addition, it is essential to define policies and standards that specify the application of obfuscation techniques to datasets with varying sensitivity classifications.

    Objectives
    • Define effective and consistent data obfuscation techniques for mitigating data security concerns.
    • Define the criteria for the appropriate application of various obfuscation techniques.
    • Ensure highly secure controls for the reversibility of obfuscation techniques applicable to each data element.
    • Ensure consistent application of obfuscation techniques to all linked datasets.
    • Ensure that quasi-identifiers are obfuscated if those identifiers are combinable to reveal the identity of an individual.
    • Ensure traceability for any obfuscated data, including the ability to track and control the dissemination of such data.
    Advice for data practitioners

    Data obfuscation is the process of obscuring, redacting, or transforming all or part of a data element to prevent the identification of parties or inappropriate disclosure of private information that may be contained in that data. Typically, this involves substituting placeholder data to represent the actual data. In this approach, data values classified as sensitive are altered so that the original values are no longer available.

    In general, there are three types of data obfuscation:

    1. Encryption transforms original plaintext data into ciphertext, using an encryption algorithm and an encryption key as input. Decryption converts ciphertext back to the original plaintext data and requires a separate decryption algorithm and a decryption key. Refer to CDMC 2.2 Data Classifications are Defined and Used.

    2. Data masking replaces an original value with a character string that results from a data-masking function. Masking may involve substitution, shuffling, or more complex manipulation that obfuscates data while preserving some of the statistical properties of the original data set (such as stochastic perturbance). Data-masking functions apply to data-at-rest (static data masking) or data-in-transit (dynamic data masking). The original data cannot be exposed by applying any formula to the masked value. If data is masked before ingestion into a data repository, there is a low risk of exposing sensitive data if that repository is breached (since the contents of the masked elements are fabricated).

    3. Tokenization is the process of substituting a sensitive data element with a non-sensitive equivalent known as a token. Such tokens have no extrinsic or exploitable value. The token is a unique reference (identifier) mappable to sensitive data using a highly secure tokenization scheme. Tokenization typically occurs when creating or importing sensitive data into a system. Tokens require significantly fewer computational resources to process than either encryption or data masking. However, tokenization requires a mapping table and high-security measures. Tokenization hides sensitive data while substituting comparable data for processing and analytics. Typically, tokenized data can be processed more quickly, a key advantage in high-performance systems.
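    The masking and tokenization types above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the in-memory "vault" are assumptions, not part of the CDMC framework or any product, and field-level encryption is omitted because it would normally be delegated to a vetted cryptography library.

```python
import secrets

def mask_card_number(pan: str) -> str:
    """Static data masking (partial redaction): hide all but the last four digits."""
    return "X" * (len(pan) - 4) + pan[-4:]

# A tokenization scheme requires a securely stored mapping table ("vault").
# Here it is a plain dict purely for illustration.
_vault: dict[str, str] = {}    # token -> original value
_reverse: dict[str, str] = {}  # original value -> token (for consistent tokenization)

def tokenize(value: str) -> str:
    """Replace a sensitive value with a random token that has no exploitable meaning."""
    if value in _reverse:              # consistent: same input always yields the same token
        return _reverse[value]
    token = secrets.token_hex(16)      # cryptographically strong random token
    _vault[token] = value
    _reverse[value] = token
    return token

def detokenize(token: str) -> str:
    """Reverse the mapping; in practice this requires explicit authorization and audit logging."""
    return _vault[token]

masked = mask_card_number("4111111111111111")  # 'XXXXXXXXXXXX1111'
tok = tokenize("ssn-123-45-6789")
```

    Note how reversibility lives entirely in the vault: without access to the mapping table, a token reveals nothing about the original value.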

    Selecting among data obfuscation techniques must take into account the business requirements and desired outcomes. Criteria for determining the applicable techniques include:

    • Data utility – does the use case require a technique that renders an obfuscated value retaining some measure of accuracy or referential integrity?
    • The sensitivity classification of the data.
    • The location of the data store.
    • The data perimeters at which specific data elements must be obfuscated.
    • The type of obfuscation (or combination of types) best suited to specific data stores or data elements.

    The table lists a variety of data obfuscation techniques and recommendations.

    Encryption – Field-level encryption
    Definition: Replaces the field value with an encrypted value, derived from the source using a cipher and a key.
    Properties: Format preservation: possible (limited) through format-preserving encryption. Reversible: yes.
    Best practices and appropriate use: Field-level encryption is generally a less useful alternative to tokenization; for example, organization-side encryption may not fit the processing model if the system is too complex to support it.

    Masking – Partial redaction
    Definition: Masks part of the field value while preserving the general form of the data value to assist in recognizing the data type (such as hiding all but the last four digits of a credit card number).
    Properties: Format preservation: possible (limited); some formatting, such as the value type, is preserved, but not the length. Reversible: no.
    Best practices and appropriate use: Suitable when a section of the value contains information that does not depend on the rest of the value, such as zip codes and credit card numbers.

    Masking – Full redaction
    Definition: Replaces the entire data value with a single repeated value, such as "XXXXX."
    Properties: Format preservation: possible (limited); the length of the value can be shown with a sequence of masking characters. Reversible: no.
    Best practices and appropriate use: Using redaction as the default supports the principle of data minimization and encourages data users to justify why each field should be retained.

    Masking – Generalization
    Definition: Also known as coarsening; decreases the precision or granularity of the data.
    Properties: Format preservation: possible (limited). Reversible: no.
    Best practices and appropriate use: Generalization can be used to prevent linkage attacks. Examples include rounding decimal-based coordinates, rounding numeric quantities such as age, and zeroing out the last octet of an IP address.

    Masking – Stochastic perturbation
    Definition: Replaces an input with a value that has been perturbed by adding or subtracting a small amount of random zero-mean noise.
    Properties: Format preservation: yes. Reversible: no.
    Best practices and appropriate use: Perturbation aims to protect against identification when an attacker might know a specific value in the dataset. For example, sensitive values in a transaction could be perturbed by any full-unit value in a range, such that an input value of $173 perturbed by +/- $7 generates an output value in the range $166-$180.

    Masking – Substitution & shuffling
    Definition: Replaces an input value or group of values with a value taken from a predefined mapping. If the replacement value is from the same domain as the masked data, the technique is called shuffling.
    Properties: Format preservation: possible (limited). Reversible: possible, but increasingly impractical when the dataset has many unique values.
    Best practices and appropriate use: Substitution allows general control of the data, but the tradeoff is the substantial effort of configuring the substitution values. An example is a mapping in which another unique name replaces every name; this scheme is format-preserving, reversible, and provides referential integrity, but a growing number of names increases the effort to create and maintain the substitution list.

    Tokenization
    Definition: Replaces an input value with a random token that has no extrinsic or exploitable meaning or value.
    Properties: Format preservation: yes. Reversible: yes. Linkable: yes. Traceable: yes.
    Best practices and appropriate use: Tokenization should be applied as the default for all sensitive values that are not redacted. Consistently tokenized columns retain information about which records share the same value, so it is possible to calculate frequency distributions, perform analysis, and train machine learning models on consistently tokenized data with no loss of utility.
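    Two of the masking techniques above, generalization and stochastic perturbation, are simple enough to sketch directly. The helper names and bucket sizes below are hypothetical illustrations, not prescribed values.

```python
import random

def generalize_ip(ip: str) -> str:
    """Generalization: zero out the last octet of an IPv4 address."""
    octets = ip.split(".")
    octets[-1] = "0"
    return ".".join(octets)

def generalize_age(age: int, bucket: int = 5) -> int:
    """Generalization: coarsen a numeric quantity by rounding down to a bucket."""
    return (age // bucket) * bucket

def perturb_amount(amount: int, radius: int = 7, rng=None) -> int:
    """Stochastic perturbation: add zero-mean full-unit noise in [-radius, +radius]."""
    rng = rng or random
    return amount + rng.randint(-radius, radius)

print(generalize_ip("192.168.10.57"))  # 192.168.10.0
print(generalize_age(43))              # 40
print(perturb_amount(173))             # some value between 166 and 180
```

    With radius 7, the perturbation of $173 lands in the range $166-$180, matching the example in the table; the output is irreversible because the noise is discarded.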

    Obfuscation controls should be applied following regulatory obligations, company standards and risk appetite. Multiple complementary controls may be applicable to ensure sufficient security in multiple domains. Options include encrypting entire data objects, masking specific sensitive data elements, and tokenizing other data elements—according to organizational policies that apply to each of the various data elements.

    Obfuscate the data at the earliest opportunity – preferably when the data is created or imported into the cloud environment. Decide whether it is necessary to preserve referential integrity for some or all of the output domains. Establish strict controls for any application or user requests to reverse the obfuscation. Any access request should be logged for audit purposes.

    Sensitive data should not move to a lower-grade environment such as QA or Development. If there is an approved business requirement to move the sensitive data, it must be obfuscated at the migration point.
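    Obfuscation at the migration point can be sketched as a filter applied while copying records to the lower-grade environment. The column names and record shape here are hypothetical assumptions for illustration.

```python
# Columns considered sensitive under a hypothetical classification policy.
SENSITIVE_COLUMNS = {"email", "phone"}

def redact(value: str) -> str:
    """Full redaction: replace the entire value with masking characters."""
    return "X" * len(value)

def migrate_to_dev(records: list[dict]) -> list[dict]:
    """Copy records to a lower environment, obfuscating sensitive fields in transit."""
    return [
        {k: redact(v) if k in SENSITIVE_COLUMNS else v for k, v in row.items()}
        for row in records
    ]

prod = [{"id": 1, "email": "a@b.com", "plan": "gold"}]
dev = migrate_to_dev(prod)  # id and plan survive; email becomes 'XXXXXXX'
```

    Because the filter runs at the migration point, the lower-grade environment never receives the sensitive values at all.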

    Tokenization for cloud data storage and exchange

    Consider these best practices for tokenization of data that is stored in a cloud environment:

    • Tokenization is perhaps the best technique for direct identifiers and should be applied near the beginning of the data lifecycle.
    • Any tokenization capability should work across all on-premises and cloud environments in an organization.
    • Before implementation of a tokenization system, verify compatibility with any application that will depend on that system.
    • Access to the tokenization mapping table must be secured according to organizational standards, and the output domain must be sufficiently large to ensure resilience to brute-force attacks.
    • Detokenization should occur only when no viable alternative is available and authorization has been explicitly given.
    • Use different tokens for different systems to ensure traceability and mitigate the risk of sensitive data exposure (that would otherwise occur by linking datasets).
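    The last practice above, issuing different tokens per system, can be sketched with keyed hashing. This is a hypothetical scheme (the system names and key handling are assumptions); a keyed hash makes tokens consistent within a system yet unlinkable across systems, though reversal would still require a separate, secured mapping table.

```python
import hmac
import hashlib
import secrets

# One secret key per consuming system (in practice, held in a key management service).
_system_keys = {sys: secrets.token_bytes(32) for sys in ("crm", "analytics")}

def token_for_system(system: str, value: str) -> str:
    """Derive a deterministic, system-specific token via a keyed HMAC."""
    key = _system_keys[system]
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()

v = "customer-42"
crm_token = token_for_system("crm", v)
ana_token = token_for_system("analytics", v)
# crm_token == token_for_system("crm", v), but crm_token != ana_token,
# so datasets from the two systems cannot be joined on the token.
```

    A breach of one system's tokenized data therefore cannot be enriched by linking it to another system's exports.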
    Advice for cloud service and technology providers

    To effectively support an organization as it implements data obfuscation solutions, cloud service providers and technology providers should:

    • Offer the organization sufficient transparency for identifying each of the sensitive data elements throughout the cloud environment.
    • Provide native capabilities and integrations with data obfuscation tools that operate across major cloud platforms and on-premises environments.
    • Provide functionality that integrates common data cataloging, information sensitivity classification and data obfuscation solutions.
    • Provide the ability to automatically audit data environments to verify compliance with data obfuscation requirements that satisfy organizational standards and regulatory requirements.
    Questions
    • Have data classification standards been established?
    • Have the criteria been documented for data encryption, data masking and tokenization?
    • Have obfuscation techniques been selected and applied in alignment with data-usage requirements?
    • Does the tokenization system prevent reverse translation without access to the mapping content?
    • Are applications compatible with the tokenization systems (applications that require access to detokenized data)?
    • Does the ability exist for controlling reversibility, referential integrity and traceability when obfuscating data?
    • Have quasi-identifiers been obfuscated to protect against linkage attacks?
    • Does functionality exist to identify and notify if sensitive data is not obfuscated?
    Artifacts
    • Data Management Policy, Standard and Procedure – defining and operationalizing the obfuscation of sensitive data
    • Data Management Technology Tool Stack – inclusive of technologies that support the chosen obfuscation techniques
    • Obfuscated Data Access Request Log – a record of events and attempts to access obfuscated data
    Scoring

    Not Initiated

    No formal data obfuscation techniques exist.

    Conceptual

    No formal data obfuscation techniques exist, but the need is recognized, and the development is being discussed.

    Developmental

    Formal data obfuscation techniques are being developed.

    Defined

    Formal data obfuscation techniques are defined and validated by stakeholders.

    Achieved

    Formal data obfuscation techniques are established and adopted by the organization.

    Enhanced

    Formal data obfuscation techniques are established as part of business-as-usual practice with continuous improvement.

    Description

    Data Loss Prevention (DLP)—also known as data leak protection—is a strategy to detect and prevent the deliberate or accidental transfer of sensitive data beyond the network and controls of an organization. An effective DLP program includes directive, preventive, and detective controls to manage data loss for data-at-rest, data-in-motion, and data-in-use. While DLP software tools are an important element of any DLP program, any DLP system must also address the people and process aspects of the organization.

    Objectives
    • Formally establish the DLP strategy and approach within the organization.
    • Define and communicate the roles and responsibilities for the DLP program.
    • Gain approval and adopt DLP policy, standards and procedures that apply consistently across on-premises and cloud environments.
    • Select and implement DLP software tools that align with and support the DLP strategy.
    • Develop and deliver DLP awareness initiatives and training.
    • Measure and continuously improve the effectiveness of DLP measures.
    Advice for Data Practitioners

    A comprehensive DLP strategy must encompass hybrid architectures. Such architectures may include applications that span both cloud and on-premises resources, as well as desktop environments used to access cloud and on-premises resources. To endure, the DLP program must scale to encompass an ever-increasing amount and variety of technologies and cloud services. The program must also address threats and challenges encountered in cloud environments, such as data residency, storage, movement, and protection.

    A DLP strategy must define business requirements that may include the following:

    • Prevention of deliberate or accidental disclosure of sensitive data.
    • Quantification of the extent of risk reduction and the cost of compliance.
    • Compliance with contractual obligations relating to any third-party data.
    • Reduction of reputation and brand risk.
    • Protection of intellectual property.

    A formal DLP policy and supporting procedures are fundamental to the establishment of a DLP program. Practitioners should outline acceptable behavior in policy documentation and enforce this through defined procedures. The program should include supporting incident management and triage capabilities to comply with the principles of zero trust.

    Both the DLP program and any DLP software tools should exploit the cataloging and classification capabilities (refer to CDMC 2.0 Cataloging and Classification) to ensure that the scope of DLP controls is broadly acceptable and evidence can be exhibited for each control. The implementation should also exhibit data security fundamentals, including encryption, obfuscation and access control. Specific DLP control capabilities and processes may include:

    • Enforcing solutions such as encryption, tokenization and obfuscation of data-in-motion, data-at-rest, and data-in-use—according to data classification and handling requirements.
    • Blocking egress traffic from the cloud environment to destinations that are excluded from the list of expected domain names.
    • Implementing the ability to block or monitor transfers between cloud resources and endpoint devices.
    • Monitoring network flow logs for anomalous traffic and connection requests that could indicate unauthorized exfiltration of data.
    • Analyzing cloud API, system, and application logs to identify unexpected and potentially malicious activity.
    • Validating that continuous monitoring for malicious or unauthorized user behavior is in operation.
    • Monitoring of resource configurations to validate compliance against defined policies and standards.
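    The first control in the list above, blocking egress to unexpected domains, can be sketched as a simple allow-list check. The domain names and function name are illustrative assumptions; a real deployment would enforce this at the network or proxy layer rather than in application code.

```python
from urllib.parse import urlparse

# Hypothetical allow-list of expected egress destinations.
ALLOWED_EGRESS_DOMAINS = {"api.example-partner.com", "storage.example-cloud.com"}

def is_egress_allowed(url: str) -> bool:
    """Permit outbound transfers only to explicitly expected domains."""
    host = urlparse(url).hostname or ""
    return host in ALLOWED_EGRESS_DOMAINS

is_egress_allowed("https://api.example-partner.com/upload")  # permitted
is_egress_allowed("https://attacker.example.net/exfil")      # blocked
```

    Deny-by-default logic like this turns an exfiltration attempt into a visible, loggable policy violation rather than a silent transfer.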

    Practitioners should take advantage of any native DLP capabilities offered by cloud service and technology providers. Such capabilities include blocking public access to data stores, encryption, private connectivity, threat intelligence and detection of anomalies.

    The DLP program must provide evidence of control coverage and compliance with internal, regulatory, and legal requirements. The program must also ensure that the organization’s staff are aware of individual responsibilities and the resources available for mitigating data loss.

    The effectiveness of the DLP program should be reviewed regularly. Measurements of effectiveness should cover policy management, organization coverage, software tool support and automation, communications delivery and training effectiveness.

    Advice for Cloud Service and Technology Providers

    Cataloging and classification capabilities are fundamental in enabling an organization to identify the location of sensitive data elements so that DLP controls can be applied. A provider should give the organization the ability to distinguish the various instances of cloud applications. This ability enables the organization to implement different controls for different environments, such as production and development.

    A provider should offer capabilities that enable the interoperability of DLP software solutions across cloud and on-premises environments. With this capability, an organization can perform DLP tasks for multiple environments from a single console. Interoperability should be supported with the adoption of policy language standards and DLP rule specification.

    Providers should also offer capabilities and standardized outputs to correlate disparate events and help identify and investigate possible DLP events.

    DLP risks are lower if organizations can connect to cloud services through private networks—without the need for internet access or public IP addresses. In addition, providers should offer capabilities that permit the organization to identify cloud environments that are not in active use and shut down those environments. Deactivating unused environments also reduces DLP risk.

    Questions
    • Has the DLP strategy and approach for the organization been defined and approved?
    • Have the roles and responsibilities for DLP been defined and communicated?
    • Have the DLP policy and processes been defined and implemented in alignment with the DLP strategy?
    • Have DLP software tools been selected and implemented in alignment with and in support of the DLP strategy?
    • Have DLP awareness initiatives and training been developed and delivered?
    • Is the effectiveness of the DLP program regularly reviewed and enhanced?
    Artifacts
    • DLP Strategy – detailing the approach for the organization
    • Role Definitions – providing clarity on responsibilities for key roles in the DLP program
    • Data Management Policy, Standard and Procedure – defining and operationalizing DLP
    • Technology Roadmap – for applications aligned to the DLP strategy
    • Communications Plan – specifying the approach to raising awareness of DLP measures and responsibilities
    • Training Plan – identifying and implementing required skills for key roles in the DLP program
    Scoring

    Not Initiated

    No formal DLP program exists.

    Conceptual

    No formal DLP program exists, but the need is recognized, and the development is being discussed.

    Developmental

    A formal DLP program is being developed.

    Defined

    A formal DLP program is defined and validated by stakeholders.

    Achieved

    A formal DLP program is established and adopted by the organization.

    Enhanced

    A formal DLP program is established as part of business-as-usual practice with continuous improvement.

    4.2 A Data Privacy Framework is Defined and Operational

    The organization’s data privacy framework must be updated to address cloud-specific requirements and considerations. Once defined, processes and controls must be established to operationalize the framework.

    Description

    Data privacy encompasses the obligations and requirements of organizations and the rights of individuals to have personal data managed in a trustworthy, ethical, and compliant manner. It is vital to ensure data privacy is enforced throughout the entire data lifecycle: when data originates, when it is processed, and when it is stored, in both on-premises and cloud environments.

    A data privacy framework consists of policies, standards and procedures that collectively ensure the organization meets its business needs, satisfies regulatory obligations, promotes trust, and delivers appropriate, risk-balanced data privacy outcomes. It addresses the people, process, data and technology aspects of these requirements.

    Objectives
    • Ensure the data privacy framework captures cloud-specific requirements and considerations for collecting and processing personal data throughout the data lifecycle.
    • Review and refine the roles and responsibilities for data privacy, with considerations for cloud environments.
    • Define controls and processes that govern data usage within the cloud environment and relate directly to policy in the data privacy framework.
    • Define processes for regular assessment of the design and operating effectiveness of the data privacy framework and supporting controls.
    • Align privacy processes and supporting technology with the collection and processing activities in the cloud environment.
    Advice for Data Practitioners

    Data practitioners should develop, extend, and maintain a data privacy framework that accommodates all of the organization's privacy requirements, themes, and programs. The owner of the data privacy framework should be identified, and its scope should be defined. The framework should include a plan for managing data privacy risk, address data privacy communication and training for the organization, and specify the terms and frequency of privacy impact assessments.

    Privacy requirements

    Privacy requirements will vary from one organization to another. The requirements vary according to the types of personal data collected and processed, the purposes for which personal data is collected and processed, and the industry and jurisdictional footprint of the organization.

    Privacy requirements can come from various sources, including:

    • Privacy and data protection laws, regulations, court rulings, and supervisory authority guidance.
    • Society and industry expectations, best practices and norms.
    • Internal factors, such as company values and risk tolerance.
    • Third-party stakeholders, such as customers, shareholders, vendors, and business partners.

    Privacy themes

    Across all aspects of a data privacy framework, one or more themes can drive privacy requirements. These themes include but are not limited to transparency, choice, individual rights, sharing, data breach notice, retention, and deletion. While global privacy laws and regulations vary, the privacy themes within them are generally consistent. By distilling complex (typically global) privacy requirements into privacy themes, an organization can more easily interpret privacy requirements for effective and efficient privacy programs. A data privacy program must provide transparency for governance, policies, notices, roles, operations, monitoring, testing, and reporting to ensure that these are operationalized in an effective and compliant manner.

    Privacy program

    An organization implements a data privacy framework through a data privacy program, which primarily includes accountability, governance, and oversight roles and mechanisms. A Chief Privacy Officer—together with a privacy committee—ensures that monitoring, testing, measurement, reporting, escalations and periodic audits occur at the proper intervals.

    A data privacy program also generates and maintains documented policies, standards and procedures. All program documentation should articulate privacy requirements in clear language that stakeholders in the organization readily understand. This documentation defines all the privacy program operations, including data discovery and mapping, notice drafting, and deployment. Program documents should also specify the collection and implementation of privacy choices and processing of data subject requests. In addition, program documentation should also include clear explanations of roles and responsibilities, especially who is to be accountable and responsible (RACI) for each privacy program task. Key roles include the Chief Privacy Officer, privacy team, risk, compliance, business, operations, legal and audit.

    Privacy tasks are implemented in compliance with policies and standards through operations and supporting technology—including the operational functions such as data discovery and mapping, data classification, notice deployment, consent/preference management, privacy settings in applications and websites, data subject requests, data deletion, and data breach notification. Where possible, look for opportunities to simplify and centralize privacy operations, minimizing overlap, redundancy and lengthy notices, multiple privacy-choice delivery channels and processing databases, conflicting or unconnected data subject-rights processes, and data breach notifications. Simplifying and streamlining privacy operations and supporting technology can increase efficiency, reduce costs, ease compliance burden, and mitigate risk.

    Owner of the data privacy framework

    Many roles across an organization will need to consider privacy regularly, and privacy concerns may be substantial for some roles. It is vital to identify an accountable senior executive who leads the definition, development and management of the data privacy framework. This role is often titled the Chief Privacy Officer and should be filled by someone with a high level of data privacy expertise, ideally holding professional privacy certifications. Privacy accountability throughout an organization should be measured with direct oversight from the data privacy program.

    An important task in a data privacy program is that the senior privacy executive periodically publish a data privacy report structured around key privacy metrics and corresponding measurements. The executive should present the report to other senior management, thereby communicating the effectiveness and the health of the data privacy program.

    Scope of a data privacy framework

    A data privacy framework must cover the entire organization, especially all the personal data that the organization collects and processes throughout the data lifecycle. An organization should consider implementing technology that supports the effective implementation of the data privacy framework through the data privacy program.

    It is necessary to establish data discovery and mapping in all areas where personal data is collected and processed to identify and capture all personal data touchpoints—throughout the data lifecycle—across the organization.

    Data privacy risk management

    Data practitioners should adopt and apply a privacy-by-design and a risk-based approach to implementing a data privacy framework. An organization should develop and periodically review its privacy risk appetite. As appropriate, it is important to update the data privacy framework to reflect risk appetite—including privacy requirements, privacy themes, and all other privacy program elements.

    The data privacy framework and the privacy program should be assessed against industry standards aligned with best-practice guidance. Widely implemented standards include the AICPA Privacy Management Framework, SOC 2 Privacy Controls, ISO 27701, Data Protection Management Program (DPMP), HITRUST, and NIST privacy controls.

    Communication and training

    Ensure clarity and consistency for privacy notices in all business units, especially notices about the same individual's personal data. Privacy practices should be clear and consistent to anyone who uses an application, website, or social media space to engage with the organization.

    Also, ensure that all relevant departments are proportionally represented in each privacy process—especially for marketing, data subject rights, deletions and data breach notices. To properly implement the data privacy framework, it is essential to cultivate strong connections from the Privacy team to the Information Security, Data management and Records Management teams. In particular, these connections are important to maintain consistency in data privacy definitions and classifications.

    To cultivate an organizational culture that highly values data privacy, implement suitable training across the organization—especially for management and key roles such as marketing or privacy operations.

    Privacy impact assessment

    Practitioners should work with key stakeholders to define and implement an effective Privacy Impact Assessment process that includes efficient assessment criteria, tasks, technology and the events that trigger when an assessment is necessary. For example, it may be necessary to conduct a Privacy Impact Assessment when a large amount of sensitive personal data is processed or personal data migrates across jurisdictions.

    Also, ensure that Privacy Impact Assessments are designed and implemented with a strong emphasis on data ethics. Refer to CDMC 3.2 - Ethical Access, Use, & Outcomes of Data Are Managed. Finally, include cross-border data movement triggers and clearance questions in the Privacy Impact Assessment, as appropriate. Refer to CDMC 1.4 - Data Sovereignty and Cross-Border Data Movement are Managed.

    Advice for Cloud Service and Technology Providers

    Cloud service and technology providers should support flexible metadata tagging of personal data that accommodates multiple jurisdictions with various definitions of personal data. In addition, a cloud environment data catalog should readily capture various legal constructs, data classifications, usage categories and retention policies of its organizations. Practitioners should verify that only authorized users can access such metadata.
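    Flexible, multi-jurisdiction tagging of personal data can be pictured with a minimal data structure. This is a hypothetical sketch; the class, jurisdiction codes, and labels below are illustrative assumptions, not a catalog vendor's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class ElementTags:
    """One catalog entry carrying per-jurisdiction personal-data classifications."""
    element: str
    classifications: dict = field(default_factory=dict)  # jurisdiction -> label

    def tag(self, jurisdiction: str, label: str) -> None:
        self.classifications[jurisdiction] = label

    def is_personal_data(self, jurisdiction: str) -> bool:
        return jurisdiction in self.classifications

ip = ElementTags("client_ip")
ip.tag("EU", "personal data")            # treated as personal data under GDPR
ip.tag("US-CA", "personal information")  # and under CCPA terminology
```

    The same element can thus satisfy differing legal definitions of personal data without duplicating the catalog entry; access to these tags would itself be restricted to authorized users, as noted above.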

    Technologies should be in place to support integration with systems that manage records of processing. There should be a clear linkage among all processing activities, applications, legal entities and data. Providers must offer support for the consumption and retention of usage, notice, and privacy choice information.

    In addition, providers should offer the ability for an organization to comply with data subject-rights requests, including access and deletion rights.

    Questions
    • Have roles and responsibilities for Data Privacy been reviewed and refined with considerations for the cloud?
    • Does the data privacy framework—including the Privacy Program, policies and procedures—capture cloud-specific requirements and considerations for collecting and processing personal data throughout the data lifecycle?
    • Have processes been defined that regularly assess the design, operating effectiveness and supporting controls of the data privacy framework?
    • Have operational privacy processes and supporting technology been aligned to new collection and processing activities within the cloud environment?
    Artifacts
    • Data Privacy Framework – reflecting requirements for the cloud environment with a detailed summary of roles and responsibilities and requirements mapped to controls
    • Data Management Procedure – for regular reviews and updates of the data privacy framework to ensure the procedures and capabilities address changing requirements and identified shortcomings
    Scoring

    Not Initiated

    No formal data privacy framework has been defined.

    Conceptual

    No formal data privacy framework has been defined but the need is recognized, and its development is being discussed.

    Developmental

    Definition of the formal data privacy framework is being developed.

    Defined

    The formal data privacy framework is defined and validated by stakeholders.

    Achieved

    Definition of the formal data privacy framework is established and adopted by the organization.

    Enhanced

    Definition of the formal data privacy framework is established as part of business-as-usual practice with continuous improvement.

    Description

    The organization must show evidence that the data privacy framework is defined and operational as part of business-as-usual practice.

    Management of personal data using protection and privacy controls includes frequent consideration of what the data is, where it resides and why it is held, together with its cataloging, classification and various uses. Putting a framework into operation should be done according to a set of clear, documented metrics that quantify how the controls ensure compliance with the data privacy framework objectives. All metrics and subsequent operational measures should be readily available to stakeholders.

    Objectives

    Each of these objectives must be met to operationalize a best-practice data privacy framework:

    • Implement clear data collection and intent-of-use notifications throughout the data lifecycle.
    • Create, implement and continuously improve the processes for personal data discovery, classification and inventory maintenance.
    • Identify and operationalize Privacy Enhancing Technologies (PETs) and data security controls in all data environments to ensure adequate protection of personal data.
    • Enhance each PET to improve the automation of preventative data process controls that help to mitigate data subject risks and privacy risks.
    • Implement processes that support data traceability, data lineage, and auditing that produce evidence for the usage and provenance of personal data according to each data subject's privacy framework and preferences.
    • Establish clear processes, procedures and mechanisms (automated wherever possible) for receiving and responding to requests from data subjects and inquiries from regulators regarding the use of personal data.
    • Define and cultivate a privacy-by-design culture across the organization, in each data management platform and domain and any reengineering and modernization effort.
    Advice for Data Practitioners

    A data privacy framework consists of policies driven by the objectives and active controls that underpin the best practices of managing data that must remain private. These policies should reflect the risk appetite of the organization that implements the framework.

    The complete set of controls that an organization implements in support of its privacy framework should support the objectives given in this sub-capability. Naturally, a specific control may support more than one framework theme.

    A control employs one or more technical constructs that people, policies, standards and procedures have specified to ensure that data privacy operations comply with the framework. Typically, the complete set of controls covers various functions for collecting and recording, using, maintaining, reporting and sharing personal data. Each control should adhere to the known preferences and rights of the data subject.

    Data practitioners can implement the policies in the data privacy framework effectively by concentrating on the following areas.

    Privacy notices and consent management

    Privacy notices must be readily accessible and written to be understood easily by all data subjects. Practitioners should be explicit about the various uses of personal data and regularly review each use case.

    Align privacy consent management models to applicable regulatory requirements. Then, ensure that processes for capturing consent, distributing intent-of-use notices and demonstrating transparency accommodate any jurisdictional limitations and communicate the nature and extent of compliant behaviors.

    Regarding capturing consent for data-use and privacy notices, document all aspects of coordination and the complete set of responsibilities for controllers and processors. Wherever applicable, provide each data subject with methods for withdrawing and adjusting consent for personal data.
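    The consent capture and withdrawal mechanics described above can be sketched as a minimal consent ledger. This is an illustrative sketch only: the class, field names and event structure are assumptions, not a prescribed design.

```python
# Minimal consent-ledger sketch: records consent per subject and purpose,
# and supports withdrawal as the framework requires. Names are illustrative.
from datetime import datetime, timezone


class ConsentLedger:
    def __init__(self):
        self._records = {}  # (subject_id, purpose) -> ordered list of events

    def record(self, subject_id: str, purpose: str, granted: bool) -> None:
        """Append a grant or withdrawal event, timestamped for audit."""
        event = {"granted": granted, "at": datetime.now(timezone.utc)}
        self._records.setdefault((subject_id, purpose), []).append(event)

    def has_consent(self, subject_id: str, purpose: str) -> bool:
        """The most recent event wins; no event means no consent."""
        events = self._records.get((subject_id, purpose), [])
        return bool(events) and events[-1]["granted"]


ledger = ConsentLedger()
ledger.record("subj-42", "marketing", granted=True)
ledger.record("subj-42", "marketing", granted=False)   # subject withdraws
print(ledger.has_consent("subj-42", "marketing"))      # False
print(ledger.has_consent("subj-42", "analytics"))      # False (never granted)
```

Keeping the full event history, rather than overwriting the latest state, is what lets the organization later evidence when consent was held and when it was withdrawn.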

    Classification and cataloging

    Effective implementation of the data privacy framework is dependent on the capabilities detailed in CDMC 2.0 Cataloging and Classification.

    For every use case in each environment, establish risk assessment methods that demonstrate a balance in managing data protection risks and the value of personal data. Data classification and categorization approaches must take into account the methods of data access on various cloud platforms. Take care to explicitly address multiple classification levels while simultaneously anticipating the extent of data proliferation in cloud environments. Any data classification capability must support jurisdictional and regulatory hierarchies, layers and intersections of control structures and support complex personal data definitions.

    Document clear definitions, responsibilities and the coordination necessary for discovering, classifying and categorizing personal data, taking into account various capabilities for data origination, transformation, storage and disposition. Adopt technology enhancements for comprehensive management of personal data. An example is data analysis systems that identify and classify data assets. Discovery tools should continuously accommodate new data types, data structures and data storage environments. These tools should integrate seamlessly together to minimize errors and manual intervention.

    Shared responsibility

    Controllers and processors should establish clear guidelines for explicitly assignable actions for direct collection and management of personal data of all classifications. For each data store, process and user type, documentation should exist that outlines the responsibilities for managing and collecting personal data. For personal data stores and processes, define and document the best configuration for data security and de-identification. Most cloud service providers offer expertise in data security, breach detection, mitigation and response.

    Controllers and processors must agree on responsibility boundaries for capturing, recording, and reporting each type of data use. Also, each controller and processor should publish the controls each can provide, such as those driven by policy or governance, data topological-access controls and manual enforcement.

    Privacy enhancing technologies

    Privacy enhancing technologies (PETs) aim to reduce privacy risks associated with data processing. They are sometimes called privacy enhancing techniques or privacy preserving technologies (PPTs). Generally, these technologies protect data by manipulating, replacing, concealing, or perturbing the original data, making it extremely difficult to reidentify. Common techniques include categorization, tokenization and encryption, data masking and anonymization. Refer to CDMC 4.1 Data is Secured and Controls are Evidenced for additional information and advice for protecting information using these techniques together with conventional security controls.
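    Two of the techniques named above, tokenization and masking, can be sketched as follows. This is a deliberately simplified illustration: production PETs use vaulted or cryptographic tokenization services with controlled detokenization, and the functions below are invented for the example.

```python
# Simplified sketch of two PET techniques: tokenization (reversible via a
# secured vault) and masking (irreversible redaction). Illustrative only;
# real PETs use hardened vaulted or cryptographic tokenization services.
import secrets

_vault = {}  # token -> original value; in practice a tightly secured store


def tokenize(value: str) -> str:
    """Replace a sensitive value with a random surrogate token."""
    token = "tok_" + secrets.token_hex(8)
    _vault[token] = value
    return token


def detokenize(token: str) -> str:
    """Recover the original value; access would be strictly controlled."""
    return _vault[token]


def mask_email(email: str) -> str:
    """Irreversibly mask the local part, keeping one character and the domain."""
    local, _, domain = email.partition("@")
    return local[0] + "***@" + domain


t = tokenize("4111 1111 1111 1111")
print(t != "4111 1111 1111 1111")           # surrogate, safe to propagate
print(mask_email("alice@example.com"))      # a***@example.com
```

The key distinction the sketch shows: tokenization is reversible for authorized processes, while masking discards information permanently, so the two suit different risk-tolerance levels and use cases.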

    A PET should accommodate the policies in the data privacy framework and provide capabilities for logically and physically organizing both protected and de-identified data and for addressing jurisdictional requirements. PETs should support flexible, secure de-identification and anonymization capabilities. With conventional data security and access controls, a PET should protect personal data for a wide variety of use cases, each of which may manage data with different risk tolerance levels. Compile requirements and define processes for maintaining data privacy through the PETs. When practical, automate the transfer of personal data in and out of a cloud data management platform.

    Integrate each PET with data privacy management software and tools already in use by the organization. Data management stakeholders and processes should have selective access to the various features of each PET, with a clear view of the data protection requirements that are necessary for each specific use case. A new data type, storage type or process may require additional flexibility and extensibility from one or more PETs to integrate with native data management systems and comply with governance policies.

    Risk assessments involving personal data re-identification should consider exploiting high-volume processing efficiencies available in most cloud computing environments. Any personal data risk assessment should involve every applicable jurisdiction and organize the assessment according to each jurisdiction. Risk assessments should include outlier analysis, hidden/surrogate identifiers, linkage attacks involving publicly available personal data and transactional uniqueness. Such capabilities are typically not possible with conventional on-premises and built-for-purpose systems.

    Data processing

    Practitioners should seek to craft each personal data capture process precisely and regularly assess each for proper compliance. New and evolving data ingestion, access and manipulation tools will likely require integration with data discovery and cataloging processes. These tools should support the careful identification and management of new origination sources of personal data collection.

    Be alert to the introduction of novel data collection and processing techniques, which are part of the value proposition of cloud platforms. These techniques often entail greater complexity to support sufficient processing requirements. Novel approaches to personal data capture should be well-integrated into data classifications and categorizations that comply with data subject consent agreements.

    Data practitioners should employ data privacy disclosure controls to manage the exchange of personal data between cloud environments, jurisdictions and data domains. Examples of disclosure controls include minimization, consent and data protection. Also, take care to establish a detailed Record of Processing Activities and ensure each use case has a firm legal basis.
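    A Record of Processing Activities entry can be sketched as a simple structure that ties each use case to a legal basis, with a check that flags activities lacking one. The fields below are illustrative assumptions loosely modeled on GDPR Article 30 contents, not a prescribed schema.

```python
# Sketch of a Record of Processing Activities (RoPA) with a validation that
# every activity declares a legal basis, as the text above requires.
# Field names loosely follow GDPR Article 30; the schema is illustrative.

ACCEPTED_BASES = {"consent", "contract", "legal_obligation",
                  "vital_interests", "public_task", "legitimate_interests"}


def validate_ropa(activities: list) -> list:
    """Return the names of processing activities missing a valid legal basis."""
    return [a["name"] for a in activities
            if a.get("legal_basis") not in ACCEPTED_BASES]


ropa = [
    {"name": "payroll", "legal_basis": "legal_obligation",
     "data_categories": ["salary", "bank_details"], "jurisdiction": "EU"},
    {"name": "ad_targeting", "legal_basis": None,   # no basis recorded
     "data_categories": ["browsing_history"], "jurisdiction": "EU"},
]
print(validate_ropa(ropa))  # ['ad_targeting']
```

Run regularly, a check like this turns "each use case has a firm legal basis" from a policy statement into an automatable control.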

    Data movement and data lineage

    Data movements may include minimal decision-making constraints and, consequently, data lifecycle management policies must account for additional copies of data. Perform analysis to balance lower-friction data movement and storage costs with data collection obligations—especially if some of the data may be virtualized or subject to a legal hold.

    Tracking and lineage for data movement should support core jurisdictional guidance by providing evidence of compliance. Pay special attention to data-sharing use cases, even assuming complete anonymization processes for data de-identification are in place. Optimally, it should be possible to block or notify on any non-compliant data use.
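    The block-or-notify behavior described above can be sketched as a policy gate evaluated before a data movement executes. The policy table below is an illustrative assumption for the example, not a statement of any actual regulation or adequacy decision.

```python
# Sketch of a pre-movement compliance gate: allow or block a data transfer
# based on classification and destination jurisdiction. The policy table is
# an illustrative assumption, not a statement of any regulation.

ALLOWED_DESTINATIONS = {
    # classification -> jurisdictions the data may move to (assumed policy)
    "public": {"EU", "US", "APAC"},
    "personal": {"EU"},   # e.g. restricted pending a clearance review
}


def check_movement(classification: str, destination: str) -> tuple:
    """Return ('allowed', '') or ('blocked', reason) for a proposed transfer."""
    allowed = destination in ALLOWED_DESTINATIONS.get(classification, set())
    if not allowed:
        # A real implementation would block the pipeline and raise an alert.
        return ("blocked", f"{classification} data may not move to {destination}")
    return ("allowed", "")


print(check_movement("public", "APAC"))   # ('allowed', '')
print(check_movement("personal", "US"))   # blocked with a reason
```

Unknown classifications fall through to an empty set and are blocked, so the gate fails closed rather than open.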

    Data subject requests

    To coordinate data subject interface touchpoints and data auditing, insist on simple and timely responses to data subject and regulator requests. Be prepared to accommodate data subject requests that originate in a single jurisdiction or span multiple jurisdictions.

    Coordinate new and evolving data processing from various cloud platforms to produce a timely response to data subjects and regulatory agencies. Different types of data will require different types of protection from re-identification. It is important to assess the risks of novel and evolving protections carefully. To respond quickly to personal data-use requests, consider additional automation as personal data proliferates across multiple cloud platforms.

    Advice for Cloud Service and Technology Providers

    To support an organization effectively as it implements its data privacy framework, cloud service and technology providers should consider the following.

    Privacy notices and consent management

    Build adequate support for new sources of personal data collection and processing. Likewise, optimize the management of data subject consent and notices, and make any necessary adjustments to jurisdictional processing. Provide documentation that helps organizations define the roles and execute the responsibilities of controllers and processors.

    Classification and cataloging

    As detailed in CDMC 2.0 Cataloging & Classification, build and support the deployment and use of automated data classification and categorization schemes that embrace the methods of easy data access and data sharing in cloud platforms. These schemes should support new, complex data types and drive data protection, complex classification and data-use hierarchies, lineage capture and controls for managing replication and proliferation. Document responsibilities of both the organization and the cloud service provider.

    Shared responsibility

    Ensure that documentation emphasizes accountabilities for personal data management by controllers and processors—and establish criteria for agreeing to the responsibilities for capturing, recording and reporting on data use. Collaborate with organizations to understand and document personal data stores, processes and the best data security and de-identification configurations.

    Privacy enhancing technologies

    Provide privacy-by-design guidelines to organizations that support a continuously expanding array of data storage, processing and analytical capabilities. Create flexible, consistent, outcome-oriented designs that accommodate complex data and new data types. Support the establishment of cloud computing standards for data security and data privacy.

    Offer the latest technology to support data privacy methods such as de-identification and anonymization. Integrate each PET with the flexibility to support data protection for a variety of use cases. Innovate to provide options that automate complex jurisdictional processing of personal data.

    Processing

    Provide controls that execute automatically during data transfer operations. Support all organization applications, ensure the correct implementation of data-use policies and monitor for compliance.

    Questions
    • Have data collection and intent of use notifications been implemented throughout the data lifecycle?
    • Have processes been established for personal data discovery, classification and inventory?
    • Have PETs been put to use in all organization data environments to protect personal data?
    • Have PETs been enhanced to improve the automation of preventative data process controls that help to mitigate data subject risks and privacy risks?
    • Is there a process to provide evidence for the usage and provenance of personal data across the organization?
    • Are processes in place to receive and respond to data subject requests and inquiries from regulators?
    • Has the organization embraced a privacy-by-design culture? If so, is there verifiable evidence for how the cultural expectations have been communicated, the requirements the organization is expected to meet and how the implementation of those requirements will be confirmed?
    Artifacts
    • Data Management Procedures – for the execution of privacy requests and inquiries
    • Data Privacy Notification Catalog
    • Data Catalog Report – evidencing the discovery and classification of personal data across all environments
    • PET Catalog – listing the PETs supported in the organization and summarizing their capabilities (including the extent to which they automate preventative controls)
    • Data Lineage Reports – evidencing the provenance and use of personal data
    • Privacy-by-design Principles & Guidelines
    Scoring

    Not Initiated

    The data privacy framework is not operational in cloud environments.

    Conceptual

    The data privacy framework is not operational in cloud environments, but the need is recognized, and the implementation is being discussed.

    Developmental

    Implementation of the data privacy framework in cloud environments is being planned.

    Defined

    Implementation of the data privacy framework in cloud environments has been validated by stakeholders.

    Achieved

    The data privacy framework is operational in cloud environments.

    Enhanced

    Operation of the data privacy framework is established as part of business-as-usual practice in cloud environments with continuous improvement.

    4.3 Protection & Privacy – Key Controls

    The following Key Controls align with the capabilities in the Protection & Privacy component:

    • Control 9 – Security Controls
    • Control 10 – Data Protection Impact Assessments

    Each control with associated opportunities for automation is described in CDMC 7.0 – Key Controls & Automations.

    Control 9: Security Controls

    Component

    4.0 Protection & Privacy

    Capability

    4.1 Data is Secured, and Controls are Evidenced

    Control Description
    1. Appropriate Security Controls must be enabled for sensitive data.
    2. Security control evidence must be recorded in the data catalog for all sensitive data.
    Risks Addressed

    Data is not contained within the parameters determined by the legislative, regulatory or policy framework under which the organization operates. Data loss or breaches of privacy requirements result in reputational damage, regulatory fines and legal action.

    Drivers / Requirements

    The sensitivity level of the data dictates what level of encryption, obfuscation and data loss prevention should be enforced. The requirements for Security Controls and Data Loss Prevention become more stringent as the sensitivity level of the data increases.

    Legacy / On-Premises Challenges

    It is difficult to ensure that encryption is always on for sensitive data.

    Automation Opportunities
    • Provide security controls capabilities including encryption, masking, obfuscation and tokenization that are turned on automatically based on the sensitivity of a data set.
    • Automate recording of the application of security controls.
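    Both automation opportunities above can be sketched together: a policy that switches on controls according to a data set's sensitivity and records each application as catalog evidence. The tier names, control sets and evidence fields are illustrative assumptions, not CDMC-prescribed values.

```python
# Sketch of both automation opportunities: enable security controls from a
# data set's sensitivity tier and record each application as evidence.
# Tier names and control sets are illustrative, not CDMC-prescribed.
from datetime import datetime, timezone

CONTROLS_BY_SENSITIVITY = {
    "public":       set(),
    "internal":     {"encryption_at_rest"},
    "confidential": {"encryption_at_rest", "encryption_in_transit", "masking"},
    "restricted":   {"encryption_at_rest", "encryption_in_transit",
                     "masking", "tokenization", "dlp_monitoring"},
}

evidence_log = []  # stands in for the data catalog's evidence store


def apply_controls(dataset: str, sensitivity: str) -> set:
    """Enable the controls required by the tier and record evidence of each."""
    controls = CONTROLS_BY_SENSITIVITY[sensitivity]
    for control in sorted(controls):
        evidence_log.append({               # automated evidence recording
            "dataset": dataset,
            "control": control,
            "applied_at": datetime.now(timezone.utc).isoformat(),
        })
    return controls


applied = apply_controls("hr_salaries", "restricted")
print(len(applied), "controls applied;", len(evidence_log), "evidence records")
```

Because the evidence record is written in the same step that enables the control, the audit trail cannot drift from the controls actually applied.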
    Benefits

    Evidence that the appropriate level of encryption is on and has been consistently applied is easy to produce.

    During a security audit, a data owner has a list of their data and how much of it is sensitive. For every piece of sensitive data, evidence can be produced that the data is encrypted and that a data loss prevention regime is in place for all the compute environments in which it resides.

    Having security control evidence to deliver through the catalog rather than performing a forensic cyber review is a cost savings opportunity. A full-time team of employees typically handles this work.

    Summary

    Automation that enforces and records the appropriate encryption level based on a data asset’s sensitivity level ensures security compliance and reduces manual effort to provide evidence of the controls.

    Control 10: Data Protection Impact Assessments

    Component

    4.0 Protection & Privacy

    Capability

    4.2 A Data Privacy Framework is Defined and Operational

    Control Description

    Data Protection Impact Assessments (DPIAs) must be automatically triggered for all personal data according to its jurisdiction.

    Risks Addressed

    Data is not secured to a level appropriate for the nature and content of the data set. This results either in data being secured at greater cost and inconvenience than required, or in data loss or breaches of privacy requirements that lead to reputational damage, regulatory fines and legal action.

    Drivers / Requirements

    If a data set is classified as containing personal information, an organization needs to be able to demonstrate that it has performed a data protection impact assessment on it in certain jurisdictions.

    Legacy / On-Premises Challenges

    Initiating and completing a data protection impact assessment for every data asset classified as containing personal information is a very expensive workflow. Identifying which DPIAs need to be performed can also be challenging.

    Automation Opportunities
    • Automatically initiate Data Protection Impact Assessments based on factors such as the geography of the data infrastructure, classification of the data or the specified consumption purpose.
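    The trigger logic above can be sketched as a rule evaluated whenever a data set is registered or reclassified. The jurisdiction set and high-risk purposes below are illustrative assumptions for the example, not legal guidance.

```python
# Sketch of automatic DPIA triggering based on classification, jurisdiction
# and consumption purpose, per the automation opportunity above. The
# jurisdiction set and high-risk purposes are illustrative assumptions.

DPIA_JURISDICTIONS = {"EU", "UK"}   # where DPIAs are assumed to be mandated
HIGH_RISK_PURPOSES = {"profiling", "large_scale_monitoring"}


def dpia_required(is_personal: bool, jurisdiction: str, purpose: str) -> bool:
    """Decide whether to auto-initiate a DPIA for a data set."""
    if not is_personal:
        return False                # DPIAs apply only to personal data
    return (jurisdiction in DPIA_JURISDICTIONS
            or purpose in HIGH_RISK_PURPOSES)


print(dpia_required(True, "EU", "billing"))     # True (jurisdiction trigger)
print(dpia_required(True, "US", "profiling"))   # True (purpose trigger)
print(dpia_required(True, "US", "billing"))     # False
```

Wired into the catalog's registration and reclassification events, a rule like this removes the manual step of identifying which DPIAs are needed.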
    Benefits

    Evidence that all privacy requirements have been met for sensitive data is easy to produce since DPIAs are automatically initiated.

    Cost savings opportunities arise from more efficient identification of the need for DPIAs.

    Summary

    Automatically enforcing a DPIA on data that is classified as personal ensures policy compliance and reduces manual labor costs for that function.

