Information quality is not about the quality of data in databases and data warehouses!

Information quality is about the quality of business communication in all forms: spoken, written, e-mail, reports, Internet, intranet, policy manuals, catalogs, newsletters, as well as databases and data warehouses.

But more than that, information quality is about serving our customers by providing information products and services that "consistently meet end customer and internal knowledge worker expectations" so they can be efficient and effective in performing their work and successfully meet their goals.

But there is more. Information quality is about learning from our experiences, improving our processes and sharing those experiences so that we and others do not make the same "mistakes" of the past. This is knowledge. This retained and shared learning is very much a part of the principles and processes of information quality (IQ).

The last three DM Review "Plain English on Data Quality" columns described seven fatal misconceptions that can cause information quality initiatives to fail.1 This article describes information quality and its role in enabling the "intelligent learning organization."

What is Knowledge?

Knowledge cannot be defined without defining its building blocks, beginning with the concept of data. A value chain, illustrated in Figure 1, translates data into intelligent actions. Data is simply the representation of facts, and as such forms the basis for intelligent actions. E=mc² is data, representing an important fact that is expressed as a physics formula.

Figure 1: The Wisdom Value Chain

Information is data in context; you know the meaning of the data. In this formula, "E" represents the physics concept of energy or the ability of matter or radiation to do work because of its motion or its mass or its electric charge. The "m" represents the physics term mass or measure of the quantity of matter that a body or an object contains, and "c" represents the speed of light. The meaning of this formula is that the amount of energy can be calculated by multiplying the mass of material transformed to energy by the square of the speed of light. So what? Here's what: understanding the significance of this can lead to exploiting the information.
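The calculation described above can be made concrete with a small worked example. This is only an illustrative sketch: the function name and the one-gram input are assumptions chosen for the example, not figures from the article.

```python
# Speed of light in a vacuum, in metres per second (exact by SI definition).
SPEED_OF_LIGHT_M_PER_S = 299_792_458


def rest_energy_joules(mass_kg: float) -> float:
    """Return E = m * c**2 for a mass given in kilograms."""
    return mass_kg * SPEED_OF_LIGHT_M_PER_S ** 2


# Illustrative input: one gram of matter transformed entirely to energy.
energy = rest_energy_joules(0.001)
print(f"{energy:.3e} J")
```

Knowing what each symbol means turns the bare formula (data) into something you can compute with (information): one gram of mass corresponds to roughly 9 × 10^13 joules.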

Knowledge is information in context; you know the significance of the information. Translating information into knowledge generally requires experience and reflection.2 Understanding the significance of the formula E=mc² led to the development of the capability to harness nuclear energy. But knowledge is not the end goal because knowledge can be used or abused.

Wisdom, the ultimate goal, is knowledge in context. You act appropriately as a result of the knowledge. How many times do we understand the significance of something but fail to exploit an opportunity? Wisdom requires empowerment and the courage to act. If a clerk hears a complaint (data + information) from a customer, recognizes (knowledge) the possibility of losing the customer, yet fails to try to resolve the complaint, the clerk has not acted wisely. The clerk may be individually foolish, not caring about the loss of a customer. Or the clerk may not be empowered to intervene and attempt a resolution as a result of corporate foolishness.

Knowledge management is the application of management principles and methods to the intellectual assets of the enterprise to increase business performance. Intellectual assets are the aggregation of data and information in all forms (knowledge, human expertise and know-how) of an enterprise that may be exploited to add value to the enterprise. Knowledge management includes capturing customer complaints, the actions taken to resolve them and the end results, including customer retention, and analyzing them for patterns of effectiveness. This becomes managed knowledge when it is shared with others who are points of contact with the customer, so they learn from others' experiences.

If data and information are required for knowledge and if wisdom is appropriate action based on knowledge, where does information quality fit? Answering this question requires a clear understanding of information quality.

Information Quality as Product Characteristics

If information quality is not about quality of data in a database or data warehouse, what is it?

First, information quality is broadly defined as "consistently meeting knowledge worker and end-customer expectations" through information products and services.3 Implied in this are several important notions:

  • Information is a product that can be characterized as "quality" or "nonquality." Information is the product of business processes that cause its creation and maintenance, whether in a database, on paper or in some other form. In the information age, data in all forms must be managed as direct products and not as byproducts as was the case in the industrial age. Approaching information as a byproduct "places its focus on the wrong target, usually the system instead of the end product, the information."4
  • Because information is a product, "the same principles of quality improvement that are applied to manufacturing processes to improve manufactured product quality can be applied to business processes to improve information product quality."5
  • Information quality exists only in the context of customers who use information. Products have value only when customers need them. Information has value only when knowledge workers need it to perform their own processes and accomplish their objectives. Only customers of information can, and will, assess the information quality based on how well it helps them be successful in their work. Shirou Fujita, the CEO of NTT DATA, the first information services company to win the coveted Deming Prize for quality, says, "In this field [information systems] we too often tend to emphasize achieving technological advances without paying enough attention to the needs of our consumers, the people who use our systems...Information systems will have to be even more user friendly than they are today and will have to better serve society's needs."6
  • Information has a customer-supplier value chain and relationships. The suppliers are information producers who perform processes that create actual information. Data intermediaries take information in one form, on paper for example, and transcribe that into an electronic form. They play an intermediary role in getting information from the supplier to the information customer. (See Figure 2.) The customers are knowledge workers, internal and external, such as end customers and regulatory bodies who use the information in any way. Most people perform both roles. They are knowledge workers using some data to create other data, as when a loan officer reads customer credit data to create a loan.
  • Every process owner or manager of processes that create and maintain data is accountable for the information products produced by their processes, not just physical products. Process integrity cannot be divorced from the integrity of the products produced. Further, that accountability extends to ensuring that the quality of that information meets the needs of downstream knowledge workers, not just the needs of the process owner's immediate department. Process owners must know the collective information requirements of all downstream knowledge workers for the information produced by the processes they manage. Failure to do this forces downstream knowledge workers to go out of their way to discover or correct the data about events and objects not captured correctly during the initial create processes.

Information Quality Components

There are three components of information quality, each of which must have quality. They are:

1. Data Definition Quality. Data definition represents the "information product specification" in the same way a "product specification" controls the production of a manufactured product. Data definition as used here represents not just the meaning of data, but the data name, meaning, domain value set or specification, and business rules that govern data integrity. Data definition requires data standards to assure consistency of data naming and definition. Without quality of the information product specification, you cannot control the consistency of the information product produced. Data definition is much more than just documentation of data, or even information product specification. Business terms and data form the language of the enterprise. When business terms and data are poorly defined, communication fails. Quality in all aspects of data definition facilitates communication in all forms: oral, written, database and data warehouse, as well as Internet and intranet communication.

2. Data Content Quality. Data content is not just values in database fields; it also includes the content of Web site pages, data in manually prepared reports, data communicated in telephone or face-to-face conversations, printed on advertisements or signs, transmitted via e-mails or in any other form. Data represents real world objects and events (entity types) and facts (attributes). As such, accuracy, non-duplication and concurrency of redundant data are the most important inherent quality characteristics. Accuracy means that data correctly reflects the real-world object or event being described. Data with inherent quality can be combined to form new information accurately. With an accurate birth date you can calculate "insurance age" from a given date, for example, or categorize a customer into a specific behavioral profile.

3. Data Presentation Quality. Data presentation quality consists of several pragmatic quality characteristics. Data has pragmatic quality when it enables all knowledge workers to effectively use it to perform their work processes and accomplish the enterprise objectives. Pragmatic characteristics include timeliness (you get the information when you need it), contextual clarity (you can easily understand the information as presented) and fact completeness (you get the right information to perform your work).7 Pragmatic quality means data is fit for all purposes requiring it, not just one purpose or use.
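The first two components above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not a method prescribed by the article: the `FieldDefinition` class, the `marital_status` domain and the helper names are all hypothetical, and "insurance age" is approximated here as age at last birthday, which may differ from the rule a given insurer uses.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FieldDefinition:
    """A tiny 'information product specification' for one data element."""
    name: str
    meaning: str
    allowed_values: frozenset  # the domain value set


# Hypothetical definition, for illustration only.
MARITAL_STATUS = FieldDefinition(
    name="marital_status",
    meaning="Customer's current legal marital status",
    allowed_values=frozenset({"single", "married", "widowed", "divorced"}),
)


def conforms(definition: FieldDefinition, value: str) -> bool:
    """Definition quality in use: does a value fall inside the defined domain?"""
    return value in definition.allowed_values


def age_on(birth_date: date, as_of: date) -> int:
    """Age in whole years at last birthday (assumed stand-in for 'insurance age')."""
    had_birthday = (as_of.month, as_of.day) >= (birth_date.month, birth_date.day)
    return as_of.year - birth_date.year - (0 if had_birthday else 1)


def duplicate_keys(records: list, key_fields: list) -> set:
    """Content quality: return the key tuples that occur more than once."""
    seen, dupes = set(), set()
    for record in records:
        key = tuple(record[field] for field in key_fields)
        if key in seen:
            dupes.add(key)
        seen.add(key)
    return dupes


# Hypothetical sample records.
customers = [
    {"name": "A. Smith", "birth_date": date(1960, 7, 15), "marital_status": "married"},
    {"name": "A. Smith", "birth_date": date(1960, 7, 15), "marital_status": "M"},
]
print(conforms(MARITAL_STATUS, customers[1]["marital_status"]))
print(age_on(customers[0]["birth_date"], date(1999, 7, 14)))
print(duplicate_keys(customers, ["name", "birth_date"]))
```

The derived age is only as good as the birth date behind it, which is the point of inherent quality: accurate, non-duplicated source facts are what make derived information trustworthy.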

While you can describe the characteristics of what constitutes data quality, information quality is more a value system and a habit. Information quality is an environment or culture of continual process improvement to eliminate the causes, and costs, of defective data. Deming emphasizes this: "Quality is not something that is included in a product, but something fundamental within the minds of business leaders."8

Information Quality as a Value System

Information must be valued as a key enterprise resource to be properly managed. Further, the information customers are valued because they depend on quality information to perform their jobs. "The consumer is the most important part of the production line,"9 Deming states as a ramification of his first point of quality. The information quality value system is one in which everyone values their information customers. In this value system, everyone's motto is, "I will serve my information customers to meet their needs and help them be successful." In a world of "me" and "my" and WIIFM (what's in it for me), this is counterintuitive. "What do I get if I spend the time to capture the information you need? This will keep me from meeting my objectives." The reality is that if everyone in the organization subscribes to information quality, everybody wins, especially the end customer. When the enterprise consistently meets end customers' needs, those customers become more loyal and customer lifetime value increases, increasing the profits of the enterprise and its shareholder value. That, in turn, increases jobs, morale and pay.

Masaaki Imai, founder of Kaizen Institute and world renowned for his work in Kaizen (continuous process improvement involving everyone in the organization), writes, "Quality begins when everybody in the organization commits to never sending rejects or imperfect information [emphasis mine] to the next process. Dr. Kaoru Ishikawa's axiom, 'The next process is the customer,' refers to the internal customer within the same company. One should never inconvenience the customers in the next process by sending rejects to them. In gemba, such a state of mind is often referred to as 'Don't get it. Don't make it. Don't send it.' When everybody subscribes to and lives by this philosophy, a good quality-assurance system exists."10 What is the ramification? If information producers and process owners treat their information customers this way and their information producers treat them likewise, everyone wins. This will fundamentally change the efficiency, effectiveness and profitability of the enterprise.

Information Quality as a Habit

Stephen Covey emphasizes that the characteristics of personal effectiveness are not one-time actions; they are habits.

There is a Japanese word that every information quality practitioner, business and IS manager and employee should know. It is the word "muda." Muda means waste. In Kaizen, muda refers to all the things that do not add value such as overproduction, inventory, transportation, waiting and correction. Muda elimination (all forms of waste and non-value-adding activities) is one of the major principles of Kaizen.11

Everyone in an information quality environment will come to have a rabid intolerance of information muda or waste, eliminating it and, with it, the costs of nonquality data. Nonquality organizations today accept the costs of information scrap and rework and information muda as normal costs of doing business, because they do not see the correlation between information muda and lost profits. Quality organizations see this waste, despise it and work to eliminate it wherever it is found.

The habit of information quality exists when people come to recognize information problems, question why they occur, determine the root cause and implement corrective actions to the processes to eliminate recurrence of the problems.

The meaning of this in data warehousing is as follows. When you find defective data in the source databases, correct it, or the data warehouse will fail. But data cleansing is a muda activity, one that adds costs to the data warehouse. Data cleansing, while necessary given the state of information quality, is muda precisely because it would not be necessary if source data were properly defined, created and maintained. But if you shrug off known data quality problems as somebody else's responsibility, you will be sanctioning information scrap and rework as a valid and permanent process. Information quality means notifying the source process owners of the problem, analyzing the root cause with the information producers and implementing process improvements to eliminate or minimize the need for subsequent data cleansing (muda). This is the habit of information quality.
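One way to picture this habit in a data warehouse load is to record every defect found during cleansing against the source process that produced it, so the pattern can be taken back to that process owner. The sketch below is a hypothetical illustration: the record layout, the process names and the defect types are assumptions made for the example.

```python
from collections import Counter


def summarize_defects_by_source(defects: list) -> Counter:
    """Count defects per (source process, defect type) to expose root-cause patterns."""
    return Counter((d["source_process"], d["defect_type"]) for d in defects)


# Hypothetical defects logged while cleansing data for the warehouse.
defects = [
    {"source_process": "order_entry", "defect_type": "missing_postal_code"},
    {"source_process": "order_entry", "defect_type": "missing_postal_code"},
    {"source_process": "order_entry", "defect_type": "invalid_birth_date"},
    {"source_process": "claims_intake", "defect_type": "duplicate_customer"},
]

# The most frequent patterns are the first candidates for root-cause analysis
# with the information producers of the named source process.
for (process, defect_type), count in summarize_defects_by_source(defects).most_common():
    print(f"{process}: {defect_type} x{count}")
```

The cleansing still happens, but the summary turns each correction into evidence for improving the source process rather than a silent, repeated cost.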

Information Quality and Knowledge Management

Data and information can exist in the absence of people. Knowledge, however, can only exist as a characteristic in people when they apply their experience and reason to recognize the significance of information. Wisdom requires action of people that leads to the accomplishment of good results. Wisdom requires empowered and motivated people to act on knowledge.

Knowledge management requires the gathering of the collective knowledge from across the enterprise, storing it and making it shareable with others. It is in this process of sharing the lessons learned that knowledge can be leveraged. It is this leveraging of knowledge and information that multiplies the value of those intellectual assets. If only one person in the organization knows how to sell products effectively to a specific customer profile, you gain the value, and profits, of that know-how one time. If, however, that knowledge, rules of thumb and tricks of the trade can be captured and shared with 100 people, the value and profits of those intellectual assets are multiplied 100-fold.

But suppose the information used to determine a customer's profile is inaccurate. The value of the "know-how" is sub-optimized because it will be applied to the "wrong" customers. Worse, it may alienate a customer and cause the loss of that customer. Consider the travel agency rep who called to promote a "dream second-honeymoon vacation" to a "qualified" couple only to find out the wife had died six months earlier.

Furthermore, the rules of thumb representing the knowledge and experience could be faulty or defective. For example, the success achieved may have been the result of the individual's personality rather than the methods or techniques used. Training others in these techniques would then be wasted effort.

Because of the information value chain, information produced in only part of the business may have quality for those in that business area, but lack quality for knowledge workers in other business areas downstream. When this occurs, it creates a dysfunctional learning organization. Downstream processes fail. Knowledge sharing fails. Business costs go up due to information scrap and rework and process failure. Profits and revenue go down due to missed and lost opportunity.

How to Enable the Intelligent Learning Organization

Every live organism learns as it grows. When learning stops, life stops or is stunted, and the organism becomes dysfunctional. But if the organism learns wrong behaviors, the results can be devastating. If a child learns aggressive behavior as a way to deal with conflict, it can lead to mass killings as we have seen all too frequently in our society.

Every organization is an organism, and likewise learns as it grows. If the organization fails to learn from its collective experiences, its life can likewise be stunted. The data stored in the organization's databases and other data sources constitutes its long-term memory bank. Its short-term memory is represented by data and information retrieved and used to perform its processes and make its decisions. When that data is inaccurate or missing, the processes performed and decisions made will be impaired. The result is the dysfunctional learning organization.

The intelligent learning organization consists of all business areas creating and sharing information to the quality standards required by the other business areas to effectively carry out their work. How does an organization become an intelligent learning organization?

  • Begin the information quality journey, recognizing that all growth takes time. No one gets there without starting.
  • Conduct an information quality maturity assessment to determine the status quo. A child cannot succeed at the university if they cannot master the subjects of primary school. Organizations must start with where they are and grow from there.
  • Raise awareness of the costs of information quality problems by taking a sample of important data and measuring the extent of the problems and quantifying the costs of information muda in all forms.
  • Begin a continuous, formal education process in the principles of information quality and management's accountabilities in creating the quality culture with management at the highest level you can access.
  • Analyze every process you perform and ask, "Does this add value or does it only add cost?" Eliminate the cost-adding work (muda). Maximize the value-adding work.
  • Develop intolerance for information muda or waste in all forms. Do not create defective information products, do not pass defective information products to others and do not accept defective information products from others; send them back.
  • Identify your own information customers, adopt a customer satisfaction mind-set toward them, and go to them to find their requirements and expectations of the information products you provide. Then help them be successful.
  • Identify your information producers, go to them and find the problems they have in providing the quality information products you require. Work with them to help determine the root causes of problems and improve the processes permanently.
  • Establish partnerships between the information producers' areas and their information customer areas.
  • Develop and provide adequate training for the information producers and process owners as to their information customers, their quality requirements and the costs of nonquality data.
  • Find the management systems that hurt information quality under the guise of productivity but actually increase the costs and reduce profits by passing the problems downstream, creating muda and scrap and rework in other business areas. Challenge them. Change them.
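The measurement step in the list above (sample important data, measure the extent of the problems, quantify the costs) can be sketched as follows. This is a rough illustration under stated assumptions: the defect test, the record layout and especially the cost-per-defect figure are hypothetical inputs that each organization would have to determine for itself.

```python
import random


def estimate_defect_cost(records, is_defective, sample_size, cost_per_defect, seed=0):
    """Estimate the defect rate from a random sample and project a total cost.

    Returns (defect_rate, projected_cost), where the projection assumes the
    sampled rate holds across all records.
    """
    if not records:
        raise ValueError("no records to sample")
    rng = random.Random(seed)  # fixed seed so the assessment can be repeated
    sample = rng.sample(records, min(sample_size, len(records)))
    defect_rate = sum(1 for r in sample if is_defective(r)) / len(sample)
    return defect_rate, defect_rate * len(records) * cost_per_defect


# Hypothetical data: half the customer records lack a postal code.
records = [{"postal_code": ""}] * 50 + [{"postal_code": "12345"}] * 50

rate, cost = estimate_defect_cost(
    records,
    is_defective=lambda r: not r["postal_code"],
    sample_size=100,
    cost_per_defect=4.0,  # assumed cost of scrap and rework per defective record
)
print(f"defect rate {rate:.0%}, projected cost {cost:.2f}")
```

Even a crude projection like this puts a number on information muda, which is usually what it takes to get management attention.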

Like the healthy organism, the healthy organization in the knowledge age will operate as a single entity, rather than as autonomous business areas, creating and maintaining accurate information that is shareable in all parts of the enterprise, adding experiences and lessons learned. It will grow, gaining knowledge as it does. And as it gains and shares its knowledge, it will leverage its intellectual assets, increasing its customer satisfaction and shareholder value.

1 Larry English, "Seven Deadly Misconceptions about Information Quality," Parts 1-3, DM Review, June-September, 1999.

2 See also Thomas Davenport, Information Ecology: Mastering the Information and Knowledge Environment, New York: Oxford University Press, 1997, p. 9.

3 Larry English, Improving Data Warehouse and Business Information Quality, New York: John Wiley & Sons, 1999, p. 24.

4 Kuan-Tsae Huang, Y. Lee, and R. Wang, Quality Information and Knowledge, Upper Saddle River: Prentice-Hall, 1999, p. 10.

5 Larry English, Improving Data Warehouse and Business Information Quality, p. 53.

6 Shirou Fujita, A Strategy for Corporate Innovation, Tokyo: Asian Productivity Organization, 1997, p. ix. NTT DATA is an independent information services company that was spun off from Nippon Telegraph and Telephone.

7 For an extensive description of quality characteristics, see English, Improving Data Warehouse and Business Information Quality, pp. 141-153.

8 Fujita, A Strategy for Corporate Innovation, p. ix.

9 W. Edwards Deming, Out of the Crisis, Cambridge, MA: MIT Center for Advanced Engineering Study, 1986, p. 23.

10 Masaaki Imai, Gemba Kaizen, New York: McGraw-Hill, 1997, p. 44. The word gemba means the real place, that is, the place where the real work that adds value occurs. Kaizen and Gemba Kaizen are registered trademarks of the Kaizen Institute.

11 Ibid. pp. 75-85.