JUN 1, 2007 1:00am ET

Data Quality Cycle 2.0

In the late 1990s, we were hired by a client to lead a data quality initiative on a large project that involved migrating data from legacy systems to an open architecture. In developing the program for our client, we immersed ourselves in a large body of literature from respected authors on the principles of data quality. We discovered that much of the data quality work we reviewed focused on principles and theory but was lacking in the area of implementation.

As a result, we struggled with exactly how to design and implement a practical data quality program for our client. What our client required was a straightforward methodology for implementing, practicing and promoting data quality. Our task was to develop a methodology that was repeatable, with well-defined objectives and deliverables. The result was the Data Quality Cycle (DQC) 1.0.

Over the last 10 years, we have refined the DQC, which is now in its second release (DQC 2.0). In this article, we describe the end-to-end data quality process that we have developed, which can be implemented in virtually any environment. We hope you will gain insight into developing a standardized data quality strategy that you can implement at your own company.

Figure 1: The Data Quality Cycle

Figure 1 represents the basic DQC. In later sections, we will examine individual processes and decision points within the cycle. At its highest level, however, DQC 2.0 consists of four main phases: discovery, definition, remediation and prevention. As with any quality effort, improvement is continuous. For this reason, the phases within DQC 2.0 are iterative and represent a cycle rather than an endpoint.

Discovery

The DQC begins with the discovery phase, which is illustrated in Figure 2.

Figure 2: Discovery Phase

The first step in the discovery phase is to identify a candidate data quality problem. If there are several pain points, this step may require prioritization, because not all data quality errors are equally important. The business should decide which problems have a direct impact on the strategic or tactical goals of the organization. If, for example, customer satisfaction is an important business driver, then quality customer data is an obvious prerequisite. In this case, customer data quality problems may be ranked above data quality problems tied to secondary business drivers.

The second step in this phase is to estimate the cost of the problem. The third step is to estimate the cost of the solution, that is, the remediation. Both estimates are extremely important, yet they are frequently neglected.

The reason it is imperative to estimate both the cost of the problem and the remediation is that there must be ample justification for every data quality initiative. While this is only an estimate, it will provide cost justification for proceeding to a thorough assessment and deploying a remediation strategy. This will help "sell" the data quality initiative to senior and executive management and engender the necessary support.
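As a rough illustration of the cost justification described above, the following Python sketch compares an estimated problem cost with an estimated remediation cost. The class name, the dollar figures and the two-year payback threshold are all assumptions made for this example; they are not part of the DQC itself.

```python
from dataclasses import dataclass


@dataclass
class InitiativeEstimate:
    """Rough discovery-phase figures for one candidate problem (hypothetical)."""
    name: str
    annual_problem_cost: float  # estimated yearly cost of living with the defect
    remediation_cost: float     # estimated one-time cost of fixing it

    def payback_years(self) -> float:
        """Years until the avoided problem cost covers the remediation cost."""
        if self.annual_problem_cost <= 0:
            return float("inf")
        return self.remediation_cost / self.annual_problem_cost

    def is_justified(self, max_payback_years: float = 2.0) -> bool:
        """True when the estimate pays back within the chosen horizon."""
        return self.payback_years() <= max_payback_years


# Illustrative numbers only.
estimate = InitiativeEstimate("duplicate customer records", 120_000.0, 90_000.0)
print(f"{estimate.name}: payback in {estimate.payback_years():.2f} years, "
      f"justified={estimate.is_justified()}")
```

A payback period is only one possible way to frame the justification; an organization might just as reasonably use a net benefit figure or a ranking against other initiatives.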

Once you have estimated the cost of both the problem and remediation, the next step is to establish the objectives of the data quality initiative. There must be clear and concise objectives that can be validated to ensure the strategy is effective.

Definition

Following the discovery phase is the definition phase. The process flow for this phase is illustrated in Figure 3.

Figure 3: Definition Phase

The definition phase is where the heavy lifting begins. The first step is to define the measurement criteria. Regardless of whether you are verifying product quality or data quality, measurement is essential. This step should answer the fundamental question: what are we going to measure? This will include measurements of data type conformance, syntax, completeness, precision, validity, accessibility, timeliness, etc.
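To make two of these criteria concrete, here is a minimal Python sketch that measures completeness and syntax validity on a handful of records. The sample records, field names and regular expressions are assumptions chosen for illustration; real measurement criteria would come from the business rules for the data being assessed.

```python
import re

# Hypothetical sample records; None or "" marks a missing value.
customers = [
    {"id": 1, "email": "ann@example.com", "postal_code": "30301"},
    {"id": 2, "email": "", "postal_code": "3030"},
    {"id": 3, "email": "bob[at]example.com", "postal_code": "94105"},
    {"id": 4, "email": None, "postal_code": None},
]

# Assumed syntax rules, for illustration only.
RULES = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "postal_code": re.compile(r"^\d{5}$"),
}


def completeness(records, field):
    """Fraction of records in which the field is populated."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def validity(records, field, pattern):
    """Fraction of populated values that match the syntax rule."""
    values = [r[field] for r in records if r.get(field) not in (None, "")]
    if not values:
        return 0.0
    return sum(1 for v in values if pattern.match(v)) / len(values)


for field, pattern in RULES.items():
    print(field, completeness(customers, field), validity(customers, field, pattern))
```

Keeping completeness and validity as separate measurements makes it easier to tell a missing value apart from a malformed one, which usually point to different root causes.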

With the measurement criteria firmly established, it is time to develop your assessment plan. The plan covers not only what will be measured, but also where and how the measurements will be applied. You may need to assess the data in each of the systems that contain it to get an accurate picture of its quality. How the metrics and measurements are applied will depend on the systems and business processes that affect the data.

After the plan and measurement criteria are defined, it is time to assemble the team. The team should include highly skilled people who are well-versed in quality measurement techniques and methods, along with business knowledge experts familiar with both the processes and the data. The team's charter to assess defects must be mandated by senior management in order to alleviate push-back from data producers and consumers who may feel that their data is being scrutinized.

The next step is to measure the defects and determine the true cost of the defects that are discovered. Once the analysis is complete, the actual costs may differ significantly from the estimated costs arrived at in the discovery phase. A data quality problem that was considered high cost (and therefore probably high priority) may turn out to be less costly than believed, and issues estimated as low cost may turn out to be significant.

At this point, you will compare the actual costs with the estimated costs. This comparison, together with the ability to tie the costs back to your primary data quality objectives, allows you to manage your remediation efforts and direct your limited resources to where they deliver the greatest benefit.
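The comparison of estimated and actual costs can be sketched in a few lines of Python. The issue names and cost figures below are invented for illustration, and ranking by measured cost is just one plausible way to re-prioritize.

```python
# Hypothetical figures: (issue, estimated annual cost, measured annual cost).
issues = [
    ("duplicate customers", 120_000, 45_000),
    ("missing postal codes", 20_000, 80_000),
    ("stale product prices", 60_000, 65_000),
]


def rank_by_actual_cost(rows):
    """Return issues sorted by measured cost, with the variance from the estimate."""
    ranked = sorted(rows, key=lambda row: row[2], reverse=True)
    return [(name, actual, actual - estimated) for name, estimated, actual in ranked]


for name, actual, variance in rank_by_actual_cost(issues):
    print(f"{name:22} actual={actual:>8,} variance={variance:+,}")
```

In this made-up data set, the issue that was estimated as the cheapest turns out to be the most expensive once measured, which is exactly the kind of shift the comparison is meant to expose.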
