This is the third in a series of columns describing an approach for certifying data quality. In the first column, we introduced the concept and three levels of certification; last month we explored the data quality certification oversight committee; this month we will describe a certification process. The certification process depends on the level of certification to be provided.

The first step in the certification process is to establish a team to conduct the review. This team will need to review the business processes (for the gold level), the operational systems environment (for the gold and silver levels) and the data warehousing environment (for all levels). All teams should include a representative of the data warehouse team, a representative of the responsible business area(s) and a representative of the support group responsible for the source systems for the data being certified. The team members may vary based on the data being reviewed (e.g., each person should be an expert in the data domain) and the certification level. For bronze level certification, for example, a business analyst who can attest to the requirements is sufficient; for gold level certification, the business representative must be someone with access to detailed knowledge of the business processes and with the ability to influence them. A representative from the auditing or regulatory areas is needed for the gold level, because the certification indicates compliance with governing regulations.

Basic information must exist to attain any certification level. This information includes the business definition of the data element(s) and the quality expectations. If deficiencies are uncovered, the team must then identify the steps needed to bring the data to the desired certification level.

Bronze Level: The bronze level certifies that the data was processed properly within the data warehousing environment. Remember, this level of certification merely indicates that the data warehousing migration programs did not introduce any errors. It does not provide any guarantee of the accuracy of the data, because errors may exist in the operational data captured by the data warehouse. To attain this level of certification, we need only focus on the requirements definitions and the audit and control processes and measurements embedded in the data warehousing data migration tasks and programs.

The completeness of the requirements definition is our first stop. Explicit transformation, integration and cleansing rules provide us with the basis for the certification. Armed with this information, the certification team can review the transformation jobs and programs to ensure that they are performing according to the requirements. The review may require the creation and execution of test scenarios to ensure that both correct and incorrect operational data are processed properly. In addition to verifying the end results, the team must also ensure that appropriate metrics (e.g., record counts, hash totals, data distribution statistics) are in place to provide feedback on each execution of the process.
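To make the audit and control metrics concrete, the following is a minimal sketch of how a record count, a hash total and a simple data distribution might be computed for a batch and compared between source and target. The function names (`control_metrics`, `reconcile`), the field names and the choice of SHA-256 for the hash total are assumptions made for illustration; they are not prescribed by the certification process described here.

```python
import hashlib
from collections import Counter


def control_metrics(rows, key_field, dist_field):
    """Compute simple audit-and-control metrics for one batch of records.

    rows is a list of dicts; key_field and dist_field are hypothetical
    column names chosen by the caller.
    """
    digest = hashlib.sha256()
    # Sort the keys so the hash total does not depend on row order.
    for key in sorted(str(row[key_field]) for row in rows):
        digest.update(key.encode("utf-8"))
        digest.update(b"\x00")  # separator so keys cannot run together
    return {
        "record_count": len(rows),
        "hash_total": digest.hexdigest(),
        "distribution": dict(Counter(row[dist_field] for row in rows)),
    }


def reconcile(source_rows, target_rows, key_field, dist_field):
    """Return the names of the metrics that differ between source and target."""
    src = control_metrics(source_rows, key_field, dist_field)
    tgt = control_metrics(target_rows, key_field, dist_field)
    return [name for name in src if src[name] != tgt[name]]
```

An empty list from `reconcile` means the three metrics agree for that execution; any names it returns indicate where the migration run should be investigated. In practice the metrics would be logged on every run so that trends can be reviewed over time.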

Silver Level: The silver level certifies that the data was processed correctly from the first point at which it was electronically captured. In addition to the bronze level review, the operational systems environment needs to be reviewed. Similar to the data warehousing environment, this review includes an examination of the business requirements, a verification that the system properly performs these requirements and assurance that appropriate audit and control metrics are in place.

The business requirements documentation must define the processes being automated, the data validation requirements, the error handling processes and the audit and control metrics that are needed. Because the requirements provide the basis for the certification, if they don't already exist, the business team representatives need to develop appropriate process descriptions. Once the process descriptions or requirements are documented, the application system is reviewed to ensure that the programs reflect the business needs. In addition, tests are performed on the data to ensure that these programs operate properly. Audit and control metrics similar to those defined for the bronze level need to be in place to provide ongoing information on the successful operation of the applications.

Gold Level: The review for the gold level is the most complex. It involves two major activities not included at the other levels. First, the quality expectations for the business process must be established. For example, the company must decide the importance of ensuring that a piece of data (e.g., customer gender) is correct. Once the expectation is established, the review must ensure that both the business process and the operational system meet the expectation. The operational system can merely ensure that the gender is valid (e.g., male or female); it is the business process that must ensure it is correct.

Business procedures provide the basis for the gold level certification. A good starting point is a review of the company procedures associated with the data capture. If no such procedures exist, then these must be created; if the existing procedures have no metrics, then measurements need to be added. The metrics are typically programmed into the supporting systems that capture the data. One example of a metric is the percentage of people with certain birthdates. If significantly more than 0.3% of the population has a birthdate of 11/11/1911, a process to review the business data capture approach should be initiated, because this probably indicates that someone is merely inserting data that will pass an edit check.
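The birthdate metric above could be sketched as follows. This is an illustrative example only: the function name `overrepresented_values` and the ISO date strings are assumptions, and the default threshold simply reuses the 0.3% figure from the example.

```python
from collections import Counter


def overrepresented_values(values, threshold=0.003):
    """Return each value whose share of all records exceeds the threshold.

    The default threshold of 0.003 (0.3%) mirrors the birthdate example;
    a real program would set it per data element.
    """
    if not values:
        return {}
    total = len(values)
    counts = Counter(values)
    return {
        value: count / total
        for value, count in counts.items()
        if count / total > threshold
    }
```

For instance, calling `overrepresented_values(birthdates)` on a list of captured birthdates returns the dates that appear more often than the threshold allows, along with their share of the records. A flagged date is a prompt to review how the data is being captured, not proof of an error.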

The business process review is often a complex task because many variations legitimately exist in the processes. For example, a phone interview or survey may be used for small customers, with personal interviews being conducted for larger customers. At a bank, some deposits are handled by tellers while others may be handled by ATMs. Also, the data is frequently captured at multiple locations, not all of which are governed directly by the company. For example, independent insurance brokers capture data for insurance companies. Additionally, some functions, such as a call center, are frequently outsourced, and companies depend on the service bureau to provide the needed information.

An added complexity of the gold level is that the mere investigation of the business process at a detailed level may uncover deficiencies. If these deficiencies inhibit capturing quality data, then changes to the business process must be made before the data can be certified. These changes will require significant business participation and will require support from executives with the authority to modify the existing practices.

In this and the previous two columns, we introduced the concept of data quality certification. The overall objective of the certification is to increase user confidence in the data they receive from the data warehouse environment. When the data has achieved the gold level, then executives and others within the company can use the data with confidence. They know that not only does the data meet the quality requirements, but associated business processes are also documented and adherence to these requirements is being tracked.

The key to successful deployment of a data quality certification program is to start slowly. To move forward, establish an initial data quality certification oversight committee and provide its members with appropriate education on data quality concepts. Once that is done, determine the data that most likely has the highest priority with respect to certification (i.e., data that supports regulatory reporting or major key performance indicators), assess its quality and estimate the effort to certify it. The committee can then sanction and fund certification effort(s) for the highest priority data groups and designate the certification team. The team can perform the appropriate certification processes and either recommend the certification level or identify steps that must be taken to meet that level. Corrective measures can then be performed and, when completed, the data will be certified.
