The following article is excerpted from the white paper, "The Overall Approach to Data Quality ROI" written by William McKnight. For a copy of the full paper, please visit or

Data quality is an elusive subject that can defy measurement, yet it is critical enough to derail any project, strategic initiative or even a company. The data layer of an organization is a critical component precisely because it is so easy to ignore the quality of that data or to make overly optimistic assumptions about its efficacy. A focus on data quality is a business philosophy that aligns strategy, business culture, company information and technology in order to manage data to the benefit of the enterprise. It is a competitive strategy. One day, our markets will expect data quality. In the meantime, each company has the opportunity to differentiate itself through the quality of its data. Leading companies are now defining the marketplace data quality expectation.

A parallel trend to data quality improvement in the marketplace is the return of return-on-investment measurement systems for technology-based initiatives. No longer is it acceptable to throw money at problems, target soft measures or lack accountability for results with technology spending. Many executives are also demanding payback for quality initiatives.

Many benefits accrue from improving the data quality of an organization, and many of them are intangible or impractical to measure. Benefits such as improved speed to solutions, a single version of the truth, improved customer satisfaction, improved morale, an enhanced corporate image and consistency between systems accumulate; however, you must choose selectively which benefits to analyze further and convert to hard dollars. ROI must be measured in hard dollars.

A program approach to data quality is required to measure data quality ROI. Data quality improvement is not just another technology. We must change our way of doing business to fully exploit data. Investments in the technologies as well as in organizational changes are necessary to reap the full rewards. Data quality is right in the sweet spot of modern business objectives, which recognize that whatever business a company is in, it is also in the business of data. Those companies with more data, cleaner data, accessible data and the means to use that data will come out ahead.

How should one begin to justify data quality improvement? The cleansing process and its maintenance will cost money and may require dedicated staff. Funds for data quality improvement will not be released until it is shown that the cleansing will improve the reliability and accuracy of key business processes such as trending, analysis or billing for product sales. Data quality can and must measure its success based on its contribution to the improvement of such objectives.

However, there has not been a methodology to articulate and improve data quality ROI until now. You can't improve what you can't measure. Therefore, we need a means for measuring the quality of our data warehouse. Abstracting quality into a set of agreed data rules and measuring the occurrences of quality violations provides the measurement in the methodology.
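As a minimal sketch of this idea, each agreed rule can be written as a test that a record either passes or fails, and the violations can then be counted per rule. The field names, rules and sample records below are hypothetical and chosen only for illustration; they do not come from the white paper.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Rule = Callable[[Record], bool]  # returns True when the record conforms


def _state_ok(r: Record) -> bool:
    state = r.get("state")
    return isinstance(state, str) and len(state) == 2 and state.isalpha()


def _total_ok(r: Record) -> bool:
    total = r.get("order_total")
    return isinstance(total, (int, float)) and total >= 0


# Hypothetical data rules, for illustration only.
RULES: Dict[str, Rule] = {
    "customer_id is present": lambda r: r.get("customer_id") is not None,
    "state is a two-letter code": _state_ok,
    "order_total is non-negative": _total_ok,
}


def count_violations(records: List[Record], rules: Dict[str, Rule]) -> Dict[str, int]:
    """Count, per rule, how many records fail it."""
    return {
        name: sum(1 for r in records if not rule(r))
        for name, rule in rules.items()
    }


def violation_rates(records: List[Record], rules: Dict[str, Rule]) -> Dict[str, float]:
    """Express each violation count as a fraction of all records."""
    total = len(records)
    counts = count_violations(records, rules)
    return {name: (n / total if total else 0.0) for name, n in counts.items()}


# Hypothetical sample data.
SAMPLE: List[Record] = [
    {"customer_id": 1, "state": "TX", "order_total": 120.0},
    {"customer_id": None, "state": "TX", "order_total": 35.5},
    {"customer_id": 3, "state": "Texas", "order_total": -10.0},
    {"customer_id": 4, "state": "CA", "order_total": 0.0},
]

if __name__ == "__main__":
    for rule_name, rate in violation_rates(SAMPLE, RULES).items():
        print(f"{rule_name}: {rate:.0%} of records in violation")
```

Tracking these rates over time gives a baseline against which a cleansing effort can be judged.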

Steps one (system surveying) and two (data quality rule determination) are detailed in full in "The Overall Approach to Data Quality ROI." Step three (data characterization) follows.

Step 3: Data Characterization

It's one thing to sit back and pontificate about what rules the data ought to conform to; it is another to measure how well the data actually conforms. The next step, therefore, is to determine the actual quality of the data with a data characterization and prioritization exercise. Typically, no one can articulate how clean or dirty corporate data is. Without this measurement, the effectiveness of activities aimed at improving data quality cannot be assessed.

Measuring data quality begins by taking inventory. By taking account of important data across the several tangible factors that can be used to measure data quality, you can begin to translate the vague feelings of dirtiness into something tangible. In so doing, focus can be brought to those actions that can improve important quality elements. Ultimately, data quality improvement will be performed against a small subset of data elements in the body of corporate data, as you will find most elements already conform to standard. However, the subset must be selected carefully. Said differently, data quality initiatives will not be comprehensive across all corporate data elements and all possibilities.

Data quality is not a perfect science and our quality efforts may never yield 100 percent adherence to the rules unless the bar has been set too low. Data quality is no place for perfectionists, but rather a place for those who understand its value proposition.

Data characterization can be performed with SQL or similar queries that show the spread of values in the systems and check for rule adherence.
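The sketch below shows what such queries might look like, using Python's built-in sqlite3 module against an in-memory table. The customer table, its columns and the two rules are assumptions made for illustration only. The first query shows the spread of values in one column and the second counts the rows that break each rule.

```python
import sqlite3


def build_sample_db() -> sqlite3.Connection:
    """Create a small in-memory table of hypothetical customer data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER, state TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?, ?)",
        [
            (1, "TX", "a@example.com"),
            (2, "TX", None),
            (3, "tx", "c@example.com"),
            (4, "CA", "not-an-email"),
            (5, None, "e@example.com"),
        ],
    )
    return conn


def value_spread(conn: sqlite3.Connection):
    """Return (state, row count) pairs, showing how values are distributed."""
    return conn.execute(
        "SELECT state, COUNT(*) AS n FROM customer "
        "GROUP BY state ORDER BY n DESC, state"
    ).fetchall()


def rule_violations(conn: sqlite3.Connection):
    """Return {rule name: (violating rows, total rows)} for each rule."""
    # Each hypothetical rule is stored as the SQL condition that VIOLATES it.
    checks = {
        "state is an uppercase two-letter code":
            "state IS NULL OR length(state) <> 2 OR state <> upper(state)",
        "email is present and contains '@'":
            "email IS NULL OR instr(email, '@') = 0",
    }
    total = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
    results = {}
    for name, violation_condition in checks.items():
        bad = conn.execute(
            f"SELECT COUNT(*) FROM customer WHERE {violation_condition}"
        ).fetchone()[0]
        results[name] = (bad, total)
    return results


if __name__ == "__main__":
    db = build_sample_db()
    for state, n in value_spread(db):
        print(f"state={state!r}: {n} rows")
    for rule, (bad, total) in rule_violations(db).items():
        print(f"{rule}: {bad} of {total} rows in violation")
```

The spread query tends to surface problems nobody thought to write a rule for, such as mixed-case codes or unexpected nulls, while the rule checks quantify adherence to the rules already agreed.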
