Why don't the traditional data quality efforts of "detect and cleanse" work in improving the quality of data? The father of many quality management practices, W. Edwards Deming, noted in his 14 points of quality: "Cease dependence on inspection to achieve quality. Eliminate the need for inspection on a mass basis by building quality into the product in the first place."1
This rule applies to the product of information. Inspection does not improve data quality; it only tells you that there is a problem. Cleansing the data after the fact does not remedy the problem - it only masks the problem. Companies are spending millions of dollars on initiatives to detect and cleanse data rather than applying the resources to actually improve the quality of their information. The best way to improve data quality is to produce quality data.
Proactive Data Quality Principles
Before we delve into the details of "proactive data quality," it is important to understand that data is a product. It is both produced and consumed. Therefore, there are common quality management practices that apply to the management of data. These are the very same principles practiced successfully in Japan at companies such as Toyota, Honda, Seiko and others that are renowned for producing quality products.
Producing a quality product is only possible if the system that produces the product is a quality system. This is an important point. Successful Japanese companies expend a great deal of effort investing in quality improvements in order to produce a quality product.
A quality system has several important characteristics, some of which include:
- Quality focus
- Efficiency
- Standardization
- Accountability
These very same principles of quality management apply to the management of information as well.
Quality Focus
The unprecedented quality of many Japanese products is certainly no accident. It is the result of the quality revolution that occurred in Japan during the 1960s. Prior to that time, Japanese goods were known for being both cheap in price and cheap in quality. Neither is true today because quality literally became the number one priority for the entire country!
In order to achieve quality, the focus must be on quality. Focusing on quality in information management produces quality data. Here is an example of information quality focus. An insurance company was plagued by a significant number of data quality problems in its customer information. It spent millions of dollars deploying a new application and even purchased a data quality analysis tool to help improve data quality. Despite these efforts, data quality errors continued.
A review of the process made the source of the problem clear. The company rewarded its customer service representatives based on the number of calls they received per hour and the number of claims they entered into the system. This naturally resulted in enormous data quality problems because the focus was on quantity rather than quality.
Management instituted a new process that rewarded its representatives for the quality of the information entered rather than the number of calls received. For this company, the number of calls received by each representative decreased by a small margin, but data quality improved markedly. This is proactive data quality in action!
Efficiency
Efficiency is absolutely essential in delivering quality. Removing unnecessary repetitive tasks and processes reduces possible points of failure. For successful Japanese companies, efficiency is one of the highest priorities and a means of producing quality products and services. The Japanese term Kaizen, commonly translated as "continuous improvement," names a common business practice in Japan. Kaizen has led to unprecedented quality improvements in Japanese manufacturing.
Data redundancy is a perfect example of data management inefficiency. It is not uncommon for companies to copy and propagate data from system to system. Customer, inventory, product, sales and other records are maintained in a variety of databases, files and mainframe segments.
Because of data redundancy, many companies find that copied data is nearly impossible to synchronize and reconcile. Which data in which system is correct? For many organizations, this question is nearly impossible to answer. That is why data redundancy is perhaps the single greatest cause of data quality errors.
Data redundancy is completely unnecessary. Information is the only reusable resource in the company; that means that it can be used by multiple users at the same time! Consequently, information needs to be stored and maintained in one location. This is efficient data management, which leads to quality data.
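The difference between copied data and data kept in one location can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the Customer record, the "billing" and "crm" system names, and the find_conflicts helper are assumptions, not part of any particular product.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: int
    email: str


def find_conflicts(systems: dict[str, dict[int, Customer]]) -> dict[int, dict[str, str]]:
    """Return, per customer id, the email held by each system when they disagree."""
    conflicts: dict[int, dict[str, str]] = {}
    all_ids = {cid for records in systems.values() for cid in records}
    for cid in all_ids:
        values = {
            name: records[cid].email
            for name, records in systems.items()
            if cid in records
        }
        if len(set(values.values())) > 1:
            conflicts[cid] = values
    return conflicts


# Redundant design (hypothetical): billing and CRM each keep their own copy.
billing = {1: Customer(1, "ann@example.com")}
crm = {1: Customer(1, "ann@example.com")}

# An update reaches only one copy, so the two copies silently diverge.
crm[1] = Customer(1, "ann.smith@example.com")
redundant_conflicts = find_conflicts({"billing": billing, "crm": crm})

# Single-location design: both systems read the same store,
# so one update is visible to every consumer.
master = {1: Customer(1, "ann@example.com")}
master[1].email = "ann.smith@example.com"
shared_conflicts = find_conflicts({"billing": master, "crm": master})
```

In the redundant design, find_conflicts reports customer 1 with two different email values and cannot say which one is right. In the single-location design there is nothing to reconcile.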
Standardization
Variation also tends to decrease quality. The way in which you remove variation is through standardization. Standardization is a hallmark of a quality system that produces a quality product. This is true in the data world as well.
Companies must have clearly defined, up-to-date standards to manage the organization's information. These standards include data architecture, data modeling, ETL, database administration, data stewardship, metadata, etc.
Standards must be implemented and followed. They must also be kept current. If your data standards have not been updated in more than three months, it is safe to assume that your standards are out of date and probably irrelevant.
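As a small illustration of removing variation through a standard, the following Python sketch maps free-form values to one canonical representation at the point of entry. The country-code table and the standardize_country function are hypothetical; a real standard would be defined and maintained by the organization's data stewards.

```python
# Hypothetical standard: every country value is stored as one canonical code.
CANONICAL_COUNTRY = {
    "us": "US",
    "usa": "US",
    "u.s.": "US",
    "united states": "US",
    "jp": "JP",
    "japan": "JP",
}


def standardize_country(raw: str) -> str:
    """Map a free-form country value to its canonical code, or raise if unknown."""
    key = raw.strip().lower()
    if key not in CANONICAL_COUNTRY:
        # Rejecting at entry keeps nonstandard values out of the data store.
        raise ValueError(f"no standard value defined for {raw!r}")
    return CANONICAL_COUNTRY[key]


entries = ["USA", " united states ", "U.S.", "Japan"]
standardized = [standardize_country(value) for value in entries]
```

Four differently written entries collapse to two standard values, and a value the standard does not cover is rejected rather than stored. A rejection of this kind is also a signal that the standard itself may need updating.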
Accountability