DCAM Framework – 5.0 Data Quality Management

Component 5

Introduction

The Data Quality Management function defines the goals, approaches and plans of action that ensure data content is of sufficient quality to support defined business and strategic objectives of the organization. The function should be developed in alignment with business objectives, measured against defined data quality (DQ) dimensions and based on an analysis of the current state of DQ. Data Quality Management is a series of processes across the full data supply chain to ensure that the data provisioned meets the needs of its intended consumers.

DQ requires an understanding of how data is sourced, defined, transformed, provisioned and consumed. DQ is not a process itself; rather, it describes the degree to which data is fit-for-purpose for a given business process or operation.

Definition

The Data Quality Management (DQM) component is a set of capabilities covering data profiling, DQ measurement, defect management, root cause analysis and data remediation. These capabilities allow the organization to execute processes across the data control environment, ensuring that data is fit for its intended purpose.

Scope

  • Establish a DQM function within the Office of Data Management (ODM).
  • Work with data management (DM) Program Management Office (PMO) to design and implement sustainable business-as-usual processes and tools for DQM.
  • Execute DQM processes against business-critical data. DQM processes include profiling and grading, measurement, defect management, root-cause fix and remediation.
  • Establish DQ metrics and reporting routines.
  • Ensure that DQM governance is integrated into the Data Governance (DG) function.

Value Proposition

Organizations that build, formalize and assign DQ responsibilities into daily routine and methodology achieve a sustainable organization-wide data culture.

Organizations that effectively implement Data Quality Management and achieve the appropriate level of DQ across the data ecosystem get a return on investment from several areas:

  • Better risk management
  • Enhanced analytics
  • Better client service and product innovation
  • Improved operational efficiencies

Overview

DQ is a broad conceptual term that needs to be understood in the context of how data is intended to be used. Perfect data is not always a viable objective. The quality of the data needs to be defined in terms that are relevant to the data consumers to ensure that it is fit for its intended purpose. The overall goal of DM is to ensure that data consumers have confidence in the data they receive. These consumers use this data to support their business functions. For them to make accurate decisions, the data must reflect the facts it is designed to represent without the need for reconciliation or manual transformation.

The organization needs to develop a DQM strategy and establish the overall plans for managing the integrity and relevance of its data. One of the essential objectives is to create a shared culture of DQ stemming from executive management and integrated throughout the operations of the organization. To achieve this cultural shift, the organization must agree on both requirements and the measurement of DQ that can be applied across multiple business functions and applications. This will enable business sponsors, data producers, data consumers and technology stakeholders to link DQ management processes with objectives.

DQ can be segmented into dimensions:

  • Accuracy: the relationship of the content with original intent
  • Completeness: the availability of required data attributes
  • Coverage: the availability of required data records
  • Conformity: alignment of data content with required standards
  • Consistency: how well the data complies with the required formats/definitions
  • Timeliness: the currency of content representation as well as whether the data is available/can be used when needed
  • Uniqueness: the degree that no record or attribute is recorded more than once
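
Several of these dimensions reduce to simple, mechanical measures. The following is a minimal sketch in plain Python over hypothetical customer records (the field names, sample values and record layout are illustrative assumptions, not part of the framework):

```python
import re

# Hypothetical sample records; field names and values are illustrative only.
records = [
    {"id": "C001", "country": "US", "email": "a@x.com"},
    {"id": "C002", "country": "us", "email": None},
    {"id": "C002", "country": "DE", "email": "b@y.com"},
]

def completeness(rows, attr):
    """Completeness: share of records where the attribute is populated."""
    return sum(r.get(attr) is not None for r in rows) / len(rows)

def uniqueness(rows, key):
    """Uniqueness: share of records whose key value occurs exactly once."""
    values = [r[key] for r in rows]
    return sum(values.count(v) == 1 for v in values) / len(values)

def conformity(rows, attr, pattern):
    """Conformity: share of populated values matching the required standard."""
    vals = [r[attr] for r in rows if r.get(attr) is not None]
    return sum(bool(re.fullmatch(pattern, v)) for v in vals) / len(vals)

print(completeness(records, "email"))               # 2 of 3 populated
print(uniqueness(records, "id"))                    # "C002" appears twice
print(conformity(records, "country", r"[A-Z]{2}"))  # "us" fails the standard
```

In practice these checks are run by profiling tooling against full data sets, but each dimension measured still reduces to a ratio of this kind against a defined rule.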

The identification and prioritization of data quality dimensions foster effective communication about DQ expectations and are an essential prerequisite of the DM initiative.

Creating a profile of the current state of DQ is an important aspect of the overall DQM function. A new profile should be created periodically and whenever data is transformed. The goal is to assess patterns in the data as well as to identify anomalies and commonalities as a baseline of what is currently stored in databases and how actual values may differ from expected values. Once the data profile is established, the organization needs to evaluate the data against the quality tolerances and thresholds defined by the DQ requirements. The evaluation also examines business requirements to validate that the data is fit-for-purpose.
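
A baseline profile of a single column can be sketched with the standard library alone. The sample values and the 10x-MAD tolerance below are illustrative assumptions; production profiling tools compute richer statistics, but the idea of baselining and then flagging departures from expected values is the same:

```python
import statistics

# Hypothetical column of transaction amounts; expected values are positive
# and of similar magnitude.
amounts = [120.0, 98.5, 101.2, 97.8, -4.0, 103.3, 5000.0]

profile = {
    "count": len(amounts),
    "mean": round(statistics.mean(amounts), 2),
    "median": statistics.median(amounts),
    "min": min(amounts),
    "max": max(amounts),
}

# Robust anomaly check: distance from the median versus the median absolute
# deviation (MAD). The 10x MAD tolerance is an illustrative choice.
med = statistics.median(amounts)
mad = statistics.median(abs(v - med) for v in amounts)
anomalies = [v for v in amounts if abs(v - med) > 10 * mad]
print(anomalies)  # the negative amount and the spike stand out
```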

The purpose of this evaluation process is to measure the quality of the most important business attributes of the existing data and to determine what content needs remediation. A responsibility of the data producer and data consumer is to identify the data that is critical to the data consumer’s business process. Prioritizing the data based on criticality then informs the DQM function which attributes require a heightened level of control and quality review. The designation of criticality requires that the highest level of accuracy and DQ treatment is applied. The assessment process identifies the data that needs to be cleansed to meet data consumer requirements. Data cleansing should be performed against a predefined set of business rules to identify defects that can be linked to operational processes.

Data cleansing should be performed as close to the point of capture as possible. There should be clear accountability and a defined strategy for data cleansing to ensure that cleansing rules are known and to avoid duplicate cleansing processes at multiple points in the data lifecycle. The overall goal is to clean data once at the point of data capture based on verifiable documentation and business rules as well as to fix the processes that allowed defective data into the system at the root cause. Data corrections must be communicated to, and aligned with, all downstream repositories and upstream systems. It is important to have a consistent and documented process for issue escalation and change verification for both data producers and data vendors.
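
The idea of applying documented, verifiable business rules at the point of capture can be sketched as a simple admission gate. The rules, field names and defect messages below are hypothetical; the point is that each rule is documented and each rejection is logged so the defect can be traced back to the producing process:

```python
import re

def require_iso_date(record):
    """Illustrative rule: trade_date must be formatted YYYY-MM-DD."""
    return bool(re.fullmatch(r"\d{4}-\d{2}-\d{2}", record.get("trade_date", "")))

def require_positive_qty(record):
    """Illustrative rule: quantity must be a positive number."""
    q = record.get("quantity")
    return isinstance(q, (int, float)) and q > 0

# Each documented cleansing rule is paired with the defect message it raises.
RULES = [
    (require_iso_date, "trade_date must be YYYY-MM-DD"),
    (require_positive_qty, "quantity must be positive"),
]

def capture(record, store, defect_log):
    """Admit the record only if every rule passes; otherwise log the defect
    so it can be linked back to the operational process that produced it."""
    failures = [msg for rule, msg in RULES if not rule(record)]
    if failures:
        defect_log.append({"record": record, "failures": failures})
        return False
    store.append(record)
    return True
```

A well-formed record such as `{"trade_date": "2024-05-01", "quantity": 10}` is admitted; one with a mis-formatted date and a negative quantity is rejected with both defect messages logged for root-cause follow-up.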

It is also important to ensure that data meets quality standards throughout the lifecycle so that it can be integrated into operational data stores. This aspect of the DQ management process is about the identification of data that is missing, determination of data that needs to be enriched and the validation of data against internal standards to prevent data errors before data is propagated into production environments.

For DQ to be sustained, a strong governance structure with the highest level of organizational support from senior executive management must be in place. This supports the DQM activities and ensures compliance to DQ processes. DQ processes need to be documented, operationalized and routinely validated via DM reviews and formal audit processes.

DQ cannot be achieved through central control. Organization-wide DQ requires the commitment and participation of a broad set of stakeholders. DQ is the result of a series of business processes creating a data supply chain. Therefore, stakeholders along that chain must be in place, authorized and held responsible for the quality of data as it flows through their respective areas. DQ requires coordinated organizational support. DQM processes and objectives must be part of the operational culture of an organization to be sustained and successful.

Core Questions

  • Is it understood that poor quality data is an indication of a broken business process or technology?
  • Is it understood that instituting a DQ system is a cultural shift that touches all aspects of business, operations and technology processes?
  • Is the required training in place to sustain the DQM function?
  • Are the necessary people and funding resources earmarked to implement and operate the DQM function?
  • Are the necessary resources in place to provide organization-wide training to support a sustainable, DQ cultural change?

Core Artifacts

The following are the core artifacts required to execute an effective Data Quality Management capability. Items with an ‘*’ link to published best practice guidelines.

  • Critical Data Element Criteria
  • Data Profiling Methodology
  • Data Quality Dimensions Framework*
  • Data Quality Metrics & Dashboards
  • Data Quality Rules Inventory
  • Defect Management Methodology
  • Root Cause Analysis Methodology

The DQM function strategy and approach must be defined and approved by stakeholders. Roles and responsibilities across the stakeholders must be established with operational processes in place and auditable.

Description

The strategy and approach must be defined for the DQM function and reflect the related vision and objectives of the Data Management Strategy (DMS). Once established, it must be formally empowered by senior management and its role communicated to all stakeholders.

Objectives
  • Formally establish the DQM strategy and approach within the organization.
  • Get approval of the DQM strategy and approach from stakeholders.
  • Ensure alignment of stakeholder plans and roadmaps with the DQM strategy and approach.
  • Obtain executive management support for the DQM strategy.
  • Communicate the role of the DQM function across the organization through formal channels.
  • Operate the DQM function collaboratively with DM initiative stakeholders.
  • Secure authority to enforce DQM compliance through policy and documented procedure.
Advice

The DQM strategy and approach encompasses the what, how and who of DQ. It needs to address what scope of data is to be scrutinized and reviewed; how the DQ assessments will be performed with metrics defined; and who will be responsible with defined roles. DQM needs to be closely aligned with the organization’s business objectives to ensure that the most important data is properly maintained and monitored. DQM involves cultural change. It is critical that a documented DQM strategy and approach is socialized with business, data and technology stakeholders to ensure awareness, support and commitment.

The rapidly evolving focus on data ethics is introducing new requirements for the DQM function. These requirements include an ethical review as part of determining that data produced is fit-for-purpose. Additionally, DQM is one of the areas where the use of Machine Learning (ML) and Artificial Intelligence (AI) may assist in the processes used to achieve quality data. These requirements and opportunities should be evaluated in the strategy and approach of the DQM function.

Alignment of the DQM strategy and roadmap to the DMS vision and objectives is achieved by agreement between the operating level data officer and the individual responsible for delivering the DG function. The operating level data officer is accountable for establishing priorities across each of the Framework Component requirements.

Questions
  • Has the DQM function been formally established?
  • Is there a DQM strategy and approach in place?
  • Is the DQM strategy and roadmap aligned to the DMS?
  • Have innovative technologies such as ML and AI been considered as part of the DQM process and infrastructure?
  • Has the review of data ethics been included in the DQM strategy and approach?
  • Has the DQM function been formally communicated to business, technology, operations, finance and risk stakeholders?
  • How has executive management demonstrated its support?
  • Has authority been granted to the DQM function to implement and enforce best practice via policy and standards?
  • Has authority been communicated to stakeholders?
  • Is there a functional partnership in place with Internal Audit?
Artifacts
  • The DQM plan
  • Description of the roles and responsibilities of the DQM function
  • Communication of specific support from executive management with distribution lists
  • Policies and procedures associated with executing and enforcing DQM
  • Bi-directional engagement with stakeholders on the DQM function authority
Scoring

Not Initiated

No formal DQM strategy exists.

Conceptual

No formal DQM strategy exists, but the need is recognized and the development is being discussed.

Developmental

The formal DQM strategy is being developed.

Defined

The formal DQM strategy is defined and has been validated by the directly involved stakeholders.

Achieved

The formal DQM strategy is established and understood across the organization and is being followed by the stakeholders.

Enhanced

The formal DQM strategy is established as part of business-as-usual practice with a continuous improvement routine.

The strategy and approach are reviewed and updated at least annually.

Description

DQM requires a network of data stewards and subject matter experts to ensure data is properly captured, processed and delivered. Accountable parties must be identified and the roles and responsibilities must be clearly communicated.

Objectives
  • Define and communicate the roles and responsibilities of the DQM function.
  • Fund and staff the DQM function.
  • Ensure and enforce alignment of activities and projects to policy and standards through the authority of the DQM function.
  • Hold individuals accountable for the DQM performance via annual reviews and compensation considerations.
Advice

DQM involves numerous stakeholders who are responsible for data requirement capture, data profiling, remediation, definitions, metadata, transformation, root cause analysis, entitlement control and coordination across the full data ecosystem. These efforts involve the assignment and empowerment of owners, stewards, curators and custodians. These accountable parties need to be at the right levels of seniority as well as understand all the internal processes associated with DQM.

With the addition of a data ethics review in the DQM process, subject matter expertise will be required either through the addition of experts or appropriate training of the data stewards. Similarly, to the extent that ML and AI are used to support the DQM process, additional skills will need to be added or developed within the stakeholders.

Questions
  • Has the DQM function been established?
  • Is the DQM function appropriately staffed and funded?
  • Does the DQM function have the authority needed to be effective?
  • Have the roles and responsibilities of the DQM function been defined, documented and socialized?
  • Have the skills for data ethics review and execution of ML and AI tools been added or developed within the stakeholders?
  • Have milestones and metrics associated with the DQM function been established?
Artifacts
  • Evidence of stakeholder identification
  • RACI matrix or other evidence of accountability assignment
  • Description of the roles and responsibilities of the DQM function
  • Staff assignments and qualifications
  • Evidence of accountability linked to performance reviews and compensation
  • Gap analysis of skills needed and in place
  • List of stakeholders and evidence of bi-directional communication
Scoring

Not Initiated

No formal DQM roles & responsibilities exist.

Conceptual

No formal DQM roles & responsibilities exist, but the need is recognized and the development is being discussed.

Developmental

The formal DQM roles & responsibilities are being developed.

Defined

The DQM roles & responsibilities are defined and have been validated by the directly involved stakeholders.

Achieved

The DQM roles & responsibilities are established and are recognized and used by stakeholders.

Enhanced

The DQM roles & responsibilities are established as part of business-as-usual practice with a continuous improvement routine.

The roles & responsibilities are reviewed and updated at least annually.

Description

Formal processes must be established for the activities of the DQM function. These processes align with the DM policy and standards of the organization and include procedures, tools and routines. The routines are required for steady-state operations.

Objectives
  • Establish formal DQM processes in alignment with the DM policy and standards.
  • Integrate the DQM processes into the overall end-to-end processes of the DM initiative.
  • Identify, schedule and maintain data control environment (DCE) routines, meetings and working sessions required for operational support.
Advice

The DQM subject matter experts should work with the business process design and optimization service within the Data Management Program (DMP) team. Together they will create and monitor the implementation of the DQM processes in alignment to the end-to-end process across the full DM initiative.

The DQM process design should include the requirements for ethical review as part of determining the data is fit-for-purpose. The design should also incorporate ML and AI into the process if included in the DQM strategy and approach.

Questions
  • Have formal processes been defined and implemented?
  • Are the procedures, tools and routines in place for implementing the processes?
  • Have innovative technologies such as AI and ML been considered as part of the DQM process and infrastructure?
  • Has the review of data ethics been included in the DQM strategy and approach?
  • Are DQM activities part of the normal operational routine of stakeholders?
  • Are there standing meetings, planning sessions and regular communications about data initiatives?
Artifacts
  • Process design artifacts, procedure guides and published routines
  • Process performance metrics reports
  • Meeting minutes, status reports and DMP announcements
Scoring

Not Initiated

No formal DQM operational processes exist.

Conceptual

No formal DQM operational processes exist, but the need is recognized and the development is being discussed.

Developmental

The DQM operational processes are being developed.

Defined

The DQM operational processes are defined and have been validated by the directly involved stakeholders.

Achieved

The DQM operational processes are established and are recognized and used by stakeholders.

Enhanced

The DQM operational processes are established as part of business-as-usual practice with a continuous improvement routine.

Description

DQ auditing must occur on three levels:

  • Quality Assurance (QA): the accountable business performs self-assessments based on defined DQ thresholds, processes, and objectives.
  • Quality Control (QC): the DM initiative performs a facilitated audit of the accountable business’ DQ and processes and is empowered to force the business to remediate any gaps found to ensure adherence to DQ thresholds and standards.
  • Internal Audit: the accountable business’ DQ and processes are subject to audits. Failure to satisfy this review may result in formal escalated audit findings written against the business.
Objectives
  • Data Stewards have performed self-assessments of the accountable business’ DQ and processes (QA).
  • The DM initiative has performed facilitated assessments of the accountable business DQ and processes (QC).
  • The DM initiative is empowered to force business teams to remediate gaps found in the operational DQ processes.
  • Internal Audit performs routine examinations of the accountable business DQ and processes.
  • Formal audit issues are generated if operational gaps are uncovered.
Advice

DQ processes – validation, root cause analysis, remediation, etc. – should be routinely audited. Audit occurs on three levels. First, self-attestation – where stakeholders evaluate and assert that they are following the data quality rules. Second, through the Office of Data Management (ODM) – where the DM initiative works with stakeholders to validate compliance. Third, through internal review – where Internal Audit formally validates that processes are being followed.

Questions
  • What are the mechanisms to ensure validation, root cause analysis, and remediation?
  • Is Internal Audit involved in DQM?
Artifacts
  • Evidence of self-attestation and enterprise ODM review
  • Evidence of Internal Audit engagement and review
Scoring

Not Initiated

There is no oversight of the DQM processes.

Conceptual

DQM oversight strategies and approaches are being discussed.

Developmental

Three levels of data quality review are being defined.

Defined

Three levels of data quality review have been identified and are being shared with stakeholders for review and approval.

Achieved

Three levels of data quality review have been implemented.

Enhanced

The process to ensure the auditability of DQM processes has a routine in place to identify opportunities for continuous improvement.

Profiling and measuring the data includes: 1) prioritizing the data in scope based on criticality and materiality; 2) defining and testing data quality rules based on business rules; and 3) measuring that the data is fit-for-purpose.

Description

The data in scope as defined by the business objectives must be prioritized based on its criticality and materiality to the data consumer business process.

Objectives
  • Define a process for prioritizing data.
  • Identify the scope of data subject to DQM, both current and historical.
  • Prioritize the scope of data in alignment with the DMS and business priorities.
Advice

An organization may establish data prioritization tiers. The DM policy and standards should define the level of data control to apply to each prioritization tier. The highest-priority tier is the critical data element (CDE). Designated CDEs receive the highest level of control to ensure the quality of these attributes is maintained. CDE designation is a controlled process to achieve agreement between the data producer and data consumer.
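
One way to make such tiers operational is a simple mapping from each tier to its required controls. The tier names, control frequencies and sign-off requirements below are illustrative assumptions, not values prescribed by the framework:

```python
# Hypothetical prioritization tiers mapped to the controls each tier attracts.
TIERS = {
    "CDE":      {"profiling": "daily",   "review": "quarterly", "sign_off": True},
    "Priority": {"profiling": "weekly",  "review": "annual",    "sign_off": False},
    "Standard": {"profiling": "monthly", "review": "ad hoc",    "sign_off": False},
}

def controls_for(tier):
    """Look up the control regime agreed for a prioritization tier."""
    return TIERS[tier]

print(controls_for("CDE"))  # CDEs receive the highest level of control
```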

Questions
  • Has the process to prioritize data been defined?
  • Has the scope of data subject to DQM been identified, prioritized and verified?
Artifacts
  • Prioritized data domain inventories
  • Prioritized CDE inventory
  • Bi-directional communication about the inventories
Scoring

Not Initiated

Data subject to DQM has not been identified or prioritized.

Conceptual

The scope of data subject to DQM is being discussed.

The concept of CDEs is being debated.

Developmental

The scope of data subject to DQM is being identified and shared with stakeholders.

CDEs are being defined.

Defined

The scope of data subject to DQM is prioritized and aligned with both strategy and business priorities.

CDEs are verified.

Achieved

The scope of data subject to DQM is approved.

CDEs are designated and actively maintained.

Enhanced

The process to identify and prioritize all relevant data has a routine in place to identify opportunities for continuous improvement.

Description

Data quality rules based on business rules must be defined and tested to confidently validate the data is fit-for-use.

Objectives
  • Define a process for the development of data quality rules.
  • Define business rules which can be interpreted into data quality rules and used to measure the quality of data.
  • Define a process for the testing of data quality rules.
  • Establish an environment and tool set for the running and testing of rules.
  • Socialize DQ rules and test output to stakeholders.
Advice

Business rules are the basis for developing the data quality rules needed to profile the quality of the data. The data quality dimensions establish the range of potential rules that may be required to determine DQ. A critical part of defining a quality rule is testing its outcome: testing is iterative during the design of an individual rule, and re-testing with each refinement is essential to identify the full range of rules, across the data quality dimensions, that the rule set needs to measure DQ accurately.

Data quality rules should be developed to test the various dimensions of quality. Not every data element can be tested for each dimension. There is an art and a science to writing quality rules. The rules and the range of rules will evolve over time. As new quality defects surface they will guide the design of new rules to detect those issues in the future. A mature data quality rule set is a coveted asset of an organization.
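
The iterative design-test-refine loop can be sketched as a tiny harness that runs a candidate rule against curated sample records with known expected outcomes. The draft ISIN rule and the samples below are illustrative assumptions (the check-digit validation a real ISIN rule would need is deliberately omitted from this first draft):

```python
def rule_valid_isin(record):
    """Candidate rule: ISIN is 12 characters with a 2-letter upper-case
    country prefix. Check-digit validation is omitted in this draft."""
    v = record.get("isin", "")
    return len(v) == 12 and v[:2].isalpha() and v[:2].isupper()

# Curated samples with the outcome the rule designer expects for each.
SAMPLES = [
    ({"isin": "US0378331005"}, True),    # well-formed
    ({"isin": "us0378331005"}, False),   # lower-case prefix
    ({"isin": "US03783310"},   False),   # too short
    ({},                       False),   # attribute missing entirely
]

def run_tests(rule, samples):
    """Return the samples the rule got wrong; an empty list means the
    current draft behaves as intended and is ready for the next refinement."""
    return [(rec, want) for rec, want in samples if rule(rec) != want]

print(run_tests(rule_valid_isin, SAMPLES))  # [] when the draft behaves
```

As new defects surface, they become new samples, and failing samples drive the next refinement of the rule.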

The ability to test data quality rules requires a testing environment. This may be a data sandbox where the data stewards can play with the data and experiment with various rules and their output. Additionally, the proper technical infrastructure to write, store, run and analyze rules is needed.

The skill set required to test rules may include business process subject matter expertise, DQ management expertise and technical infrastructure and coding expertise. Make sure you assess the skills and available resources carefully as this range of skill may not be available through a single individual.

Questions
  • Is there a defined process for the design and testing of data quality rules?
  • Are the data quality rules based on defined business rules?
  • Is there a data sandbox and appropriate tools for running data quality rule tests?
  • Have the stakeholders validated that the range of data quality dimensions applied in the rule set is adequate to determine the DQ?
Artifacts
  • Process design artifacts, procedure guides and published routines
  • Documented data quality dimensions
  • Criteria used to evaluate DQ
  • Data quality rules repository
  • Data quality rules recorded as metadata
  • Testing result reports and dashboards
  • Testing tools
  • Testing environment
  • List of stakeholders and evidence of bi-directional communication
Scoring

Not Initiated

No DQ rules or testing capability exist.

Conceptual

No DQ rules or testing capability exist, but the need is recognized and the development is being discussed.

Developmental

DQ rules and testing capability are being developed.

Defined

DQ rules and testing capability have been defined and validated by directly involved stakeholders.

Achieved

DQ rules and testing capability are established and are recognized and used by stakeholders.

Enhanced

DQ rules and testing capability are established as part of business-as-usual practice with a continuous improvement routine.

Description

The in-scope data must be profiled to determine the full spectrum of data quality dimensions (i.e., accuracy, completeness, coverage, conformity, consistency, timeliness, uniqueness). This analysis must include both a row-based analysis examining the accuracy of the record and a column-based, statistical analysis. Metadata must also be reviewed to ensure the description and intended use of data is properly defined.

Objectives
  • Define a process for profiling, analyzing and grading data.
  • Profile and statistically analyze in-scope data.
  • Review the metadata and perform gap analysis.
  • Measure, monitor and grade in-scope data.
  • Capture DQ metrics on a routine basis.
  • Report DQ metrics to business, data and technology stakeholders.
Advice

The purpose of the DQM function is to establish that the data is fit-for-purpose and can be trusted. Data profiling creates a quality benchmark for the organization. Evidence of data profiling will be expected in any Internal Audit review or regulatory examination. Data needs to be assessed against both fit-for-purpose criteria and the data quality dimensions. DQ business rules need to be defined and memorialized. A statistical and columnar analysis should be included to ensure that data is reasonable. Certain data domain types, such as time-series data, need to be evaluated against additional criteria like gaps, spikes and abnormalities.
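
The time-series criteria can be illustrated with a short sketch: a gap check for missing business-day observations and a spike check for jumps beyond a tolerance. The sample series and the 25% tolerance are assumptions for illustration:

```python
from datetime import date, timedelta

# Hypothetical daily price series with one missing business day and a spike.
series = {
    date(2024, 3, 1): 100.0,
    date(2024, 3, 4): 101.0,
    date(2024, 3, 5): 101.5,
    date(2024, 3, 7): 180.0,   # abnormal jump
    date(2024, 3, 8): 102.0,
}

def find_gaps(points):
    """Business days with no observation between the first and last point."""
    days = sorted(points)
    gaps, d = [], days[0]
    while d <= days[-1]:
        if d.weekday() < 5 and d not in points:  # Mon-Fri only
            gaps.append(d)
        d += timedelta(days=1)
    return gaps

def find_spikes(points, tolerance=0.25):
    """Dates whose value moved more than `tolerance` versus the prior value.
    Note the reversion after a spike is flagged as well."""
    days = sorted(points)
    return [d for prev, d in zip(days, days[1:])
            if abs(points[d] - points[prev]) / points[prev] > tolerance]

print(find_gaps(series))    # the missing Wednesday
print(find_spikes(series))  # the jump up and the reversion down
```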

The primary stakeholders involved in this process are the data producer and data consumer. Ultimately quality is defined by the business process requirements of the data consumer and should be formally agreed to by the data producer. Creating a standard and automated process for routinely executing the quality metrics and reporting the results is critical to meet the time constraints of the data supply chain.

Metrics are used to track DQ and drive data remediation efforts. Control points along the data supply chain capture DQ metrics that are used to produce DQ dashboards. The requirements of the data consumer are used to establish quality thresholds for the data. These thresholds permit the grading of the data against defined levels of acceptable DQ based on the minimal requirements of the specific data consumer.
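
Threshold-based grading can be sketched as follows. The dimension names, threshold values and traffic-light grades are illustrative assumptions; in practice the thresholds are agreed with the data consumer:

```python
# Hypothetical thresholds agreed with the data consumer: a score at or above
# "green" passes, at or above "amber" needs attention, anything else is "red".
THRESHOLDS = {
    "completeness": {"green": 0.99, "amber": 0.95},
    "uniqueness":   {"green": 1.00, "amber": 0.98},
}

def grade(dimension, score):
    """Grade a measured DQ score against the consumer-agreed thresholds."""
    t = THRESHOLDS[dimension]
    if score >= t["green"]:
        return "green"
    if score >= t["amber"]:
        return "amber"
    return "red"

# A dashboard row is then just the graded metrics for one data set.
metrics = {"completeness": 0.997, "uniqueness": 0.975}
dashboard = {dim: (score, grade(dim, score)) for dim, score in metrics.items()}
print(dashboard)  # completeness passes; uniqueness breaches its threshold
```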

The data profiling, analyzing and grading process should include a periodic review of the ethical use and outcomes of the data as part of determining that it is fit-for-purpose. Particular attention should be paid to the use of proxy values in a data set.

A mechanism for executing DQ rules and generating outcome reports is required to support the data profiling, analyzing and grading process. The use of AI and ML may assist in this process.

Questions
  • Has the in-scope data been profiled, analyzed and graded?
  • Is DQ profiled against business logic rules as well as for reasonableness against statistical expectations?
  • Are the right business, operational, analytical, data and technical stakeholders involved in the process?
  • Have innovative technologies such as ML and AI been considered as part of the process infrastructure?
  • Has the review of data ethics been included in the process?
  • Are standard criteria for measuring DQ defined and verified?
  • Are metrics being collected and reported on a routine basis?
  • Are the results of data measurement and grading captured as metadata?
Artifacts
  • Business rules and data profiling and measurement criteria
  • Statistical analysis results
  • A mechanism for assigning and reporting grades for DQ
  • DQ metric reports, dashboards, heat maps and other forms of output
  • List of stakeholders and evidence of bi-directional communication
Scoring

Not Initiated

Data is not profiled, analyzed or graded for the purpose of assessing DQ.

Conceptual

Data is not profiled, analyzed or graded for the purpose of assessing DQ, but the need is recognized and the development is being discussed.

Developmental

Data profiling, analysis and grading, for the purpose of assessing DQ, is being developed.

Defined

Data profiling, analysis and grading, for the purpose of assessing DQ, has been defined and validated by directly involved stakeholders.

Achieved

Data profiling, analysis and grading, for the purpose of assessing DQ, is established and conducted by stakeholders.

Enhanced

Data profiling, analysis and grading, for the purpose of assessing DQ, is established as part of business-as-usual practice with a continuous improvement routine.

It is recognized as the normal way of working.

Data remediation plans must be developed and executed to resolve the most pressing DQ issues. The remediation must include both correcting the existing data and performing root-cause-fix to eliminate future data defects.

Description

Based on the current state analysis, remediation plans must be developed to address the most pressing DQ issues. Ongoing DQ evaluation and maintenance and timelines must also be established.

Objectives
  • Define a process for prioritizing and executing data remediation.
  • Develop and prioritize data remediation plans.
  • Prepare for immediate action to deal with high priority data remediation.
  • Establish timelines for ongoing remediation.
Advice

Data remediation is about correcting the defective data that has been identified. This data should be corrected as close to the source of data capture as possible. Make sure the remediation activities are not one-off processes, but rather established as part of the DQM routine. Data remediation needs to be implemented for both data-at-rest and data-in-motion.

Questions
  • Is a DQ issue prioritization process in place?
  • Have data remediation plans been developed, verified and prioritized?
  • Has appropriate funding been allocated?
  • Is there a communications process related to data remediation?
Artifacts
  • DQ defect reports
  • Data remediation plan
  • Evidence of issue prioritization
  • Evidence of remediation being accomplished
  • List of stakeholders and evidence of bi-directional communication
Scoring

Not Initiated

Data remediation is not prioritized, planned or actioned.

Conceptual

Data remediation is not prioritized, planned or actioned, but the need is recognized and the development is being discussed.

Developmental

Data remediation prioritization, planning and actioning is being developed.

Defined

Data remediation prioritization, planning and actioning has been defined and validated by directly involved stakeholders.

Achieved

Data remediation prioritization, planning and actioning is established, recognized and used by stakeholders.

Enhanced

Data remediation prioritization, planning and actioning is established as part of business-as-usual practice with a continuous improvement routine.

It is recognized as the normal way of working.

Description

Data remediation must include both correcting the existing data that is defective and determining the root-cause of the DQ deterioration to avoid the reoccurrence of defective data in the future.

Objectives
  • Define a process for conducting root-cause analysis and fixes.
  • Determine the data defect root cause.
  • Identify and implement corrective measures to business, data and/or technology processes.
Advice

Remediating DQ issues is not merely an exercise in data correction. DQ issues can be systemic. Evaluate the depth and breadth of DQ to determine whether the organization is focused more on tactical repair than on the upstream remediation of a root-cause fix. A strong reporting structure is needed to ensure that upstream systems are aware of repetitive or continuing DQ problems.

Data defects may have a people, process, data or technical source. Having the right subject matter expertise from each of these areas will be important to the analysis of the root-cause.
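The routing of defects to the right subject matter expertise can be made concrete with a small sketch. This is an illustration only, using hypothetical team names: each defect is triaged by its suspected source category (people, process, data or technology, per the paragraph above) and routed to the group that should lead the root-cause analysis.

```python
# Illustrative sketch of defect triage: route a DQ defect to the SMEs
# responsible for its suspected source. The four categories follow the
# text above; team names are hypothetical.

SME_ROUTING = {
    "people": "training-and-operations",
    "process": "business-process-owners",
    "data": "data-architecture",
    "technology": "platform-engineering",
}

def triage(defect):
    """Return the SME group that should lead the root-cause analysis."""
    source = defect.get("suspected_source")
    # Unknown or unclassified defects escalate to a triage board.
    return SME_ROUTING.get(source, "dq-triage-board")

print(triage({"id": "DQ-101", "suspected_source": "technology"}))
```

The fallback route matters: a defect whose source cannot yet be classified still enters the process rather than sitting unowned.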

Questions
  • Is a root-cause analysis (RCA) process defined?
  • Are corrective measures linked to root cause analysis?
Artifacts
  • Evidence of DQ defect reporting across the data supply chain
  • Evidence of root cause analysis and remediation being performed
Scoring

Not Initiated

No root-cause analysis (RCA) process is defined.

Conceptual

No RCA process is defined, but the need is recognized and the development is being discussed.

Developmental

The RCA process is being developed.

Defined

The RCA process has been defined and validated by directly involved stakeholders.

Achieved

The RCA process is established, recognized and used by stakeholders.

Enhanced

The RCA process is established as part of business-as-usual practice with a continuous improvement routine.

The process is reviewed and updated at least annually.

Monitoring and maintaining the data includes: 1) implementing data quality control points; 2) capturing DQ metrics to identify defective data; and 3) continuously monitoring the data.

Description

Data control points must be developed to quantitatively measure the quality of data as it flows through business and technology processes.

Objectives
  • Define a process for establishing DQ control points.
  • Put DQ control points in place and bring them to a fully operational state along the data supply chain.
  • Record DQ controls as metadata.
Advice

DQ is governed by developing control points along the data supply chain. DQ control points need to be applied both at the point of data entry into the organization and at the point of entry into the consuming application, as well as whenever data moves and transforms along the supply chain. DQ controls include implementing business rules, establishing workflows, setting DQ tolerances and monitoring data movement.
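A control point combining business rules with a DQ tolerance can be sketched as follows. This is an illustrative example, not a DCAM-mandated design; the rules, field names and 95% threshold are all hypothetical.

```python
# Illustrative sketch of a DQ control point: a set of business rules is
# run against incoming records, and a tolerance threshold decides
# whether the feed passes. Rules and threshold are hypothetical.

RULES = {
    "price_positive": lambda r: r.get("price", 0) > 0,
    "currency_present": lambda r: bool(r.get("currency")),
}
TOLERANCE = 0.95  # feed passes if >= 95% of records satisfy all rules

def control_point(records):
    """Return (passed, failing_records) for a batch of records."""
    failures = [r for r in records
                if not all(rule(r) for rule in RULES.values())]
    pass_rate = 1 - len(failures) / len(records)
    return pass_rate >= TOLERANCE, failures

ok, failures = control_point([
    {"price": 10.0, "currency": "USD"},
    {"price": -1.0, "currency": "USD"},  # fails price_positive
])
print(ok, len(failures))  # pass rate 0.5 is below the tolerance
```

In practice the rule definitions and tolerances would themselves be recorded as metadata, per the objectives above, so that every control point is documented and auditable.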

Questions
  • Are control points defined, verified and documented?
  • Are business rules defined, verified, documented and approved?
  • Are business process flows defined and the way they handle exceptions verified?
  • Are control points, business rules and process flows operational?
Artifacts
  • Documentation on control points, business rules and process flows
  • Control process review and sign-off
Scoring

Not Initiated

No DQ control points are defined.

Conceptual

No DQ control points are defined, but the need is recognized and the development is being discussed.

Developmental

DQ control points are being developed.

Defined

DQ control points are defined and validated by directly involved stakeholders.

Achieved

DQ control points are established and recognized by stakeholders.

Enhanced

DQ control points are established as part of business-as-usual practice with a continuous improvement routine.

Control points are reviewed for relevance and accuracy at least annually and adjusted accordingly.

Description

Control points along the data supply chain capture DQ metrics that are used to produce DQ dashboards, which in turn identify defective data. The DQ defects must be part of the issue management routine of the DM initiative. The DQ issue management process must track an issue to resolution and provide continuous stakeholder communication.

Objectives
  • Define a process for managing data issues to resolution.
  • Drive and prioritize remediation efforts using DQ metric reports.
  • Establish an issue management reporting routine and infrastructure.
Advice

Stakeholder engagement, inclusive of the data consumer, is critical for successful DQ issue management. Issues need to be managed through all stages of resolution: defect triage, prioritization, root-cause analysis, root-cause fix and remediation of defective data. Stakeholders must be kept informed throughout, and the data consumer in particular must be made aware of the defective data and its impact on their business process. They may need to participate in the analysis and determination of an acceptable resolution.

Important tools to support the resolution process are an issue log and a status tracking system. The link to the issue record should be part of the metadata for all instances of defective data. This record will communicate to all users across an organization and can be used to help minimize duplication of effort when an issue is uncovered at multiple points along the data supply chain.

Often, particularly in the early stages of the DM initiative, the volume of defective data may exceed the resources available to resolve the issues. Documenting the prioritization process in the issue log, even when it results in a backlog of issues, is evidence that each issue was known and evaluated, rather than leaving new issues to be uncovered as part of an audit.
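An issue log of the kind described above can be sketched minimally in code. This is an illustration with hypothetical fields, not a reference implementation: every defect is recorded with an identifier and a priority, even when it lands in the backlog, and the identifier is what gets linked into the metadata of affected data.

```python
# Illustrative sketch of a DQ issue log: defects are recorded and
# prioritized even when backlogged, so an audit finds evidence the
# issue was known and evaluated. Fields are hypothetical.

from datetime import date

class IssueLog:
    def __init__(self):
        self.issues = []

    def record(self, description, priority):
        issue_id = f"DQ-{len(self.issues) + 1:04d}"
        self.issues.append({
            "id": issue_id,
            "description": description,
            "priority": priority,   # e.g. "high" or "backlog"
            "status": "open",
            "logged": date.today().isoformat(),
        })
        # This id is what gets linked into the metadata of all
        # instances of the defective data.
        return issue_id

    def backlog(self):
        return [i for i in self.issues if i["priority"] == "backlog"]

log = IssueLog()
log.record("Null settlement dates in trade feed", "high")
log.record("Stale FX rates at month end", "backlog")
print(len(log.backlog()))
```

Because the issue record travels with the data as metadata, a consumer who hits the same defect downstream can see it is already logged, which helps minimize duplication of effort across the supply chain.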

Questions
  • Are DQ metric reports and dashboards distributed on a routine basis?
  • Are metrics used to identify DQ issues and drive remediation?
  • Are the DQ issues captured as metadata?
  • Are the right business, operational, analytical, data and technology resources involved in defining DQ requirements?
Artifacts
  • DQ dimension metrics
  • DQ metric reports, dashboards, heat maps and other forms of output
  • List of stakeholders and evidence of bi-directional communication
Scoring

Not Initiated

Data issues are not managed.

Conceptual

Data issues are not managed, but the need is recognized and the development is being discussed.

Developmental

Data issue management is being developed.

Defined

Data issue management has been defined and validated by directly involved stakeholders.

Achieved

Data issue management is established, recognized and used by stakeholders.

Enhanced

Data issue management is established as part of business-as-usual practice with a continuous improvement routine.

It is recognized as the normal way of working.

Description

Data is monitored at control points. Control points must be established where data enters a business process or a consuming application. To achieve continuous monitoring, the data must be checked whenever it enters either type of control point. This monitoring may be real-time, a batch process or on demand.

Objectives
  • Define a process for continuous monitoring of DQ.
  • Establish an infrastructure for continuous monitoring of DQ.
Advice

The process of continuous monitoring has costs, benefits and operational challenges. Some form of automation is required to achieve continuous monitoring, and legacy systems often are not capable of it; retrofitting quality checks at the point of data capture or data use can be cost prohibitive. The challenge is then to define a technical solution that executes the quality checks as close to the point of data capture or load as possible, at an acceptable cost.
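The pattern described above can be sketched as a single check invoked from any trigger. The example is illustrative, with hypothetical control-point names and a hypothetical completeness metric: the same quality check runs whether it is fired in real time, by a batch job or on demand, and each run is recorded against its control point.

```python
# Illustrative sketch of continuous monitoring: one quality check is
# invoked whenever data enters a control point, regardless of whether
# the trigger is real-time, batch or on demand. Logic is hypothetical.

def completeness_check(records, required=("id", "amount")):
    """Fraction of records with all required fields populated."""
    if not records:
        return 1.0
    complete = sum(all(r.get(f) is not None for f in required)
                   for r in records)
    return complete / len(records)

class Monitor:
    def __init__(self, threshold=0.99):
        self.threshold = threshold
        self.results = []  # (control_point, score, within_tolerance)

    def observe(self, control_point, records):
        score = completeness_check(records)
        self.results.append((control_point, score,
                             score >= self.threshold))

monitor = Monitor()
monitor.observe("trade-ingest", [{"id": 1, "amount": 9.5}])   # real-time
monitor.observe("eod-batch", [{"id": 2, "amount": None}])     # batch
print(monitor.results)
```

Keeping the check itself independent of the trigger is one way to move quality checks close to the point of capture or load without rebuilding them for each legacy pathway.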

Questions
  • Is there a process for continuous monitoring of DQ?
  • Is the infrastructure in place to support continuous monitoring of DQ?
  • Are the defined control points and respective data quality rules being monitored?
Artifacts
  • Schedule of DQ monitoring
  • DQ defect reports
Scoring

Not Initiated

Continuous monitoring at DQ control points is not performed.

Conceptual

Continuous monitoring at DQ control points is not performed, but the need is recognized and the development is being discussed.

Developmental

Continuous monitoring at DQ control points is being developed.

Defined

Continuous monitoring at DQ control points is defined and validated by directly involved stakeholders.

Achieved

Continuous monitoring at DQ control points is established and recognized by stakeholders.

Enhanced

Continuous monitoring at DQ control points is established as part of business-as-usual practice with a continuous improvement routine.

It is recognized as the normal way of working.

5 thoughts on “DCAM Framework – 5.0 Data Quality Management”

  1. “The issues need to be managed through all stages of resolution. These stages include defect triage, prioritization, root-cause analysis, root-cause fix and remediation of defective data.” Are these stages of issue resolution defined anywhere? Specifically interested in defining the difference between ‘Defect Management’ and ‘Issue Management’ and how they relate to the generic term ‘Remediation’.

    In reference to Component: 5.0.0

    1. Issues Management is the overall process of taking an issue and seeing it through to resolution (even if the resolution is a conscious decision not to resolve).

      The quotes you provided do not use the term Defect Management but Defect Triage. Defect Management is a synonym for Issues Management; Defect Triage, however, is a sub-process of Issues Management. That sub-process investigates the issue to determine a potential cause and the SMEs required to resolve it. My experience with defect triage for data quality issues is to decide whether it is a technical issue, a data architecture issue, or a process issue. If you can determine the source of the problem to be one of these, then you know which SMEs need to be involved in the root-cause fix of the issue.

      Remediation is two-fold. First, there is bad data in my data set and I need to cleanse it. Second, why and where did the bad data get into my data set? This is the root-cause fix process: where did it break, why did it break, and what can I do to fix it so it won't break in the future.

      In reference to Component: 5.0.0

  2. “5.3.2 The ODM has an executive owner”

    It seems that the title of (sub)capability was mistakenly copied from 2.3.2.

    In reference to Component: 5.3.2

    1. Jun,

      Thank you for bringing this to our attention.

      The issue has been resolved and the sub-heading has been renamed “Root-cause analysis (RCA) process is defined”.

      In reference to Component: 5.3.2

  3. “Profiling” in this page should systematically be preceded with “Data” because of the hyperlink to the Glossary’s term details:

    “Data Profiling”
    Definition: The process of evaluating and grading a given source of data to determine whether it is fit-for-purpose.

    “Profiling”
    Definition: Is defined by the GDPR as any form of automated processing of personal data consisting of the use of personal data to evaluate certain personal aspects relating to a natural person…

    In reference to Component: 5.0.0
