This is the first of a three-part series describing fatal misconceptions about information quality, misconceptions that can cause an information quality initiative to fail outright or to appear to succeed while failing to achieve positive business results. An organization that misunderstands the fundamentals of information quality improvement will fall into the same trap as every other "methodology" silver bullet.
There are seven potentially fatal misconceptions about information quality that can undermine an information quality initiative. If these misconceptions are strongly held, they will hamper business effectiveness at best and contribute to business failure at worst.
This month's column discusses the first three of these misconceptions.
Misconception 1: Information quality is data cleansing.
Some think that by "cleansing" or "correcting" data they are improving information quality. Not true. While data cleansing and correction "improve" the quality of the information product, they are merely information "scrap and rework." Like manufacturing scrap and rework, data cleansing is rework to correct defects that would not exist if the processes worked properly. To be sure, data cleansing is required for any data warehouse or conversion project to succeed; if the data in a data warehouse is nonquality, the warehouse will fail.
Data cleansing and correction are, simply put, part of the costs of nonquality data. Every hour consumed and every dollar spent correcting data is an hour or dollar that cannot be spent doing something that adds value. Information quality is quality data produced at the source. Information quality improvement ensures that databases are designed properly and that processes are defined and operating properly.
Information quality improves processes to prevent defective data from being created. Data cleansing attacks the symptoms of a problem. It fixes the results of faulty processes. Information quality attacks the root causes of data defects and eliminates the causes of nonquality information. A truly effective data "cleansing" function is one that works itself out of a job! It will transform itself from an information scrap and rework function to a function that facilitates data defect prevention.
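The contrast can be sketched in code. This is a minimal, hypothetical illustration (the record layout, the postal-code rule and the function names are all invented for the example): the same quality rule is applied once as after-the-fact scrap and rework, and once at the source, where the process simply refuses to create the defect.

```python
def is_defective(record):
    """A placeholder quality rule: postal code must be exactly 5 digits."""
    code = record.get("postal_code", "")
    return not (code.isdigit() and len(code) == 5)

def cleanse(database):
    """Information scrap and rework: scan stored records and discard the
    defects already created (a simplification; real cleansing also
    attempts corrections). The faulty process keeps producing more."""
    return [r for r in database if not is_defective(r)]

def create_record(record, database):
    """Quality at the source: the process rejects the defect before it
    ever reaches the database, so no downstream rework is needed."""
    if is_defective(record):
        raise ValueError("Defective record rejected at the source: %r" % record)
    database.append(record)
```

The point of the sketch is where the rule runs: `cleanse` spends effort on defects that `create_record` would never have admitted in the first place.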
Misconception 2: Information quality is data assessment.
Another common misconception is that information quality is data assessment. No, again. Data audit, analysis, or assessment is simply inspection. The immediate goal of assessment is to discover defects. While some data is so important that it must have regular audits and controls in place, data assessment is a cost activity that does not, in and of itself, add value. Assessment of data quality has value when it is used to raise awareness of process failure and results in process improvements that eliminate the causes of defective data.
The ultimate goal of information assessment must be to assure that processes are creating and maintaining information quality that consistently meets all information customers' requirements. Discovery of unsatisfactory information quality must lead to information process improvement and control.
Information quality minimizes data assessment because information quality is designed into the processes and controlled during information production. An effective information quality function uses data assessment as a tool to improve the processes that create, maintain and deliver information.
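Used this way, the output of an assessment is not a list of records to fix but a list of processes to improve. A minimal sketch, with invented field names (`source_process`) and an arbitrary defect-rate threshold, of assessment serving as a pointer rather than an end in itself:

```python
from collections import Counter

def defect_rates(records, is_defective):
    """Assessment step: defect rate per producing process, assuming each
    record is tagged with the process that created it."""
    totals = Counter(r["source_process"] for r in records)
    bad = Counter(r["source_process"] for r in records if is_defective(r))
    return {p: bad[p] / totals[p] for p in totals}

def processes_to_improve(records, is_defective, threshold=0.05):
    """The deliverable: processes whose defect rate exceeds tolerance.
    These are candidates for root-cause improvement, not for rework."""
    rates = defect_rates(records, is_defective)
    return sorted(p for p, rate in rates.items() if rate > threshold)
```

The 5 percent threshold here is purely illustrative; in practice the tolerance would come from the information customers' requirements.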
Seven Deadly Misconceptions about Information Quality
Misconception 3: Conformance to business rules is the same as data accuracy.
There is a temptation to conclude that because data conforms to the business rule tests applied in automated data analysis, the data is accurate. Data that conforms to the business rules simply has validity; that is, it holds a valid value according to the defined rules. The reality is that many data errors are valid values that conform to all specified rules, yet are incorrect. One bank discovered that it had 1,700 customers with a birth date of November 11, 1911, a proportion far out of normal frequency for its customer population. The data was valid, i.e., it fell within the range of valid birth dates for its customers. However, virtually every one of those values was inaccurate. The cause? The edit rules required a value for birth date. When the information producers did not know it, they simply entered "111111," the fastest valid date they could enter!
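The bank's situation can be sketched in a few lines. This is a hypothetical illustration (the date range, the audit window and the over-representation factor are all invented): "111111" passes the validity rule every time, and only a frequency check, of the kind the bank effectively performed, hints that the values cannot all be accurate. No validity rule alone can prove accuracy.

```python
from collections import Counter
from datetime import date

def is_valid_birth_date(d, as_of=date(1999, 1, 1)):
    """Validity rule: birth date falls within a plausible range.
    Says nothing about whether the date is the customer's actual birthday."""
    return date(1880, 1, 1) <= d <= as_of

def overrepresented_values(values, factor=10):
    """Assessment heuristic: flag values whose frequency far exceeds the
    average frequency across distinct values. A hint of inaccuracy, not proof."""
    counts = Counter(values)
    expected = len(values) / len(counts)
    return {v for v, c in counts.items() if c > factor * expected}
```

Run against a population where November 11, 1911 appears 1,700 times, every record passes `is_valid_birth_date`, yet the frequency check singles that date out for investigation.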
Automated data quality assessment software that tests data only for valid values and for conformance to other reasonability and calculation rules must report the data as having validity and must clearly indicate that this does not imply accuracy. Confusing validity, or conformance to business rules, with accuracy can create a false sense of security that the processes are working properly when, in fact, they may not be. The results of decisions made on valid but inaccurate data can be just as devastating as the results of decisions made on invalid data values.
Data is a representation of real-world objects or events. Data accuracy means that the data correctly represents the real-world object or event it characterizes. Accuracy means the facts are correct.
Next month I will describe the next two misconceptions.