Developing Information Quality Metrics
Information Management Magazine, May 2005
Currently in vogue is the ability to summarize an organization's "business productivity" to senior managers using pithy representations that are expected to carry deep meaning and, at the same time, reduce the attention required to absorb that meaning. Business productivity management systems rely on key performance indicators whose values are posted to executive dashboards for the CEO's periodic (be it daily or hourly) review. The intention of these applications is to present the current state of the environment in the context of reasonable expectations. In other words, a business manager wants an overview of the "value creation" of the entire system, in much the same way that a nuclear engineer monitors the metrics associated with the safety status of a nuclear reactor.
In most areas of a business, the metrics that back up the key performance indicators may be relatively straightforward. For example, in a shoe factory, one might gauge the number of shoes coming off the production line, the rate at which shoes are being produced, the number of flawed shoes coming off the line or the number of accidents that occur each day. Each of these metrics may be represented using various visual cues, each of which provides a warning when the performance indicator reaches some critical level.
When it comes to the world of information quality, however, the analogy seems to break down, mostly because there is a disconnect between what is apparently measurable and what the value of that measurement means. For example, one may count the number of times a value is missing from a specific column in a specific table, but in the absence of any business context, it is not clear how those missing values affect the business, or if they even affect the business at all.
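As a minimal sketch of the kind of raw measurement described above, the following Python snippet counts how often a value is missing from one column. The record layout, the `phone` field, and the `count_missing` helper are hypothetical, chosen only for illustration. The resulting count, taken alone, says nothing about whether the business is affected.

```python
from typing import Any, Iterable, Mapping


def count_missing(rows: Iterable[Mapping[str, Any]], column: str) -> int:
    """Count rows whose value in `column` is absent, None, or a blank string."""
    missing = 0
    for row in rows:
        value = row.get(column)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing += 1
    return missing


# Hypothetical sample records standing in for one table.
customers = [
    {"id": 1, "phone": "555-0100"},
    {"id": 2, "phone": ""},      # blank string
    {"id": 3, "phone": None},    # explicit null
    {"id": 4},                   # column absent
]

missing_phones = count_missing(customers, "phone")
print(f"Missing phone values: {missing_phones} of {len(customers)}")
```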
Yet we all know that poor data quality does affect the business. Thus, there should be some kind of performance indicator that can capture and summarize the relationship between data that does not meet one's expectations and the organizational bottom line. The challenge, then, is to devise a strategy for identifying and managing "business-relevant" information quality metrics.
What Makes a Good Metric?
More challenging, however, is that the individuals typically tasked with devising good information quality metrics are better trained at data analysis and less skilled in business performance monitoring. Therefore, part of this strategy is to understand the characteristics of a reasonable business performance metric and then explore how to map those characteristics to the measurable aspects of data quality. The following list of characteristics, which is by no means complete, should give some guidance as to how to jump-start the strategy:
- Clarity of definition
- Measurability
- Business relevance
- Controllability
- Representation
- Reportability
- Trackability
- Drill-down capability
Clarity of Definition
Because the metric is intended to convey a particular piece of information regarding an aspect of business performance in a summarized manner, it is critical that its underlying definition be stated in a way that clearly explains what is being measured. In fact, each metric should be subject to a rigorous "standardization" process in which the key stakeholders participate in its definition and agree to the definition's final wording. In addition, it is advisable to provide the metric's value range, as well as a qualitative segmentation of the value range that relates the metric's score to its performance assessment.
Measurability
Any metric must be measurable and should be quantifiable within a discrete range. Note, however, that there are many things that can be measured that may not translate into useful metrics, and that implies the need for business relevance.
Business Relevance
The metric is of no value if it cannot be related to some aspect of business operations or performance. Therefore, every desirable metric must be defined within a business context with an explanation of how the metric score correlates with a measurement of performance. Better still is a metric whose performance measurement can be directly associated with a critical business impact; this is probably the most important characteristic of a data quality metric.
Controllability
Any measurable characteristic of information that is suitable as a metric should reflect some controllable aspect of the business. In other words, the assessment of an information quality metric's value within an undesirable range should trigger some action to improve the data being measured.
Representation
Without digressing into a discussion about the plethora of visual "widgets" that can be used to represent a metric's value, it is reasonable to note that one should associate a visual representation with each metric that logically presents the metric's value in a concise and meaningful way.
Reportability
From a different point of view, each metric's definition should provide enough information for the metric to be summarized as a line item in a comprehensive report. The difference between representation and reportability is that the representation focuses on the specific metric in isolation, while the reporting should show each metric's contribution to an aggregate assessment. In turn, this allows the manager to evaluate the priority of any issues needing resolution.
Trackability
A major benefit of metrics is the ability to measure performance improvement over time. Tracking performance over time not only validates any improvement efforts, but once an information process is presumed to be stable, tracking provides insight into maintaining statistical control. In turn, these kinds of metrics can evolve from performance indicators into standard monitors, placed in the background to notify the right individuals when the data quality measurements suddenly indicate a deviation from expected control bounds.
Drill-Down Capability
Because a data quality metric is, by design, a summary, its flip side is the ability to expose the underlying data that contributed to a particular metric score. The natural instinct, when reviewing data quality measurements, is to review the data instances that contributed to any low scores. The ability to drill down through the performance metric allows an analyst to get a better understanding of patterns (if any exist) that may have contributed to a low score, and consequently use that understanding for a more comprehensive root-cause analysis. This kind of insight allows your organization to isolate the processing stage at which any flaws are introduced and, in turn, enables you to eliminate the source of data problems (instead of the typical, counterproductive reaction of correcting the data values themselves).
Measurements of Data Quality
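As a rough illustration of the drill-down characteristic, the sketch below computes a completeness score for one column, keeps the records that lowered the score, and then tallies those records by the processing stage that produced them. The `stage` field, the sample orders, and both function names are assumptions made for this example; the article does not prescribe an implementation.

```python
from collections import Counter
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def is_missing(value: Any) -> bool:
    """Treat None and blank strings as missing."""
    return value is None or (isinstance(value, str) and not value.strip())


def completeness_with_drilldown(
    records: List[Record], column: str
) -> Tuple[float, List[Record]]:
    """Return a score in [0.0, 1.0] plus the records that failed the check."""
    failing = [r for r in records if is_missing(r.get(column))]
    if not records:
        return 1.0, failing  # nothing to measure, so nothing failed
    score = 1.0 - len(failing) / len(records)
    return score, failing


def failures_by_stage(failing: List[Record], stage_field: str = "stage") -> Counter:
    """Tally failing records by the (hypothetical) stage that produced them."""
    return Counter(r.get(stage_field, "unknown") for r in failing)


# Hypothetical order records tagged with the stage that loaded them.
orders = [
    {"order_id": 101, "ship_date": "2005-04-02", "stage": "web_entry"},
    {"order_id": 102, "ship_date": None, "stage": "batch_import"},
    {"order_id": 103, "ship_date": "", "stage": "batch_import"},
    {"order_id": 104, "ship_date": "2005-04-05", "stage": "web_entry"},
]

score, failing = completeness_with_drilldown(orders, "ship_date")
print(f"Completeness score: {score:.2f}")
for stage, count in failures_by_stage(failing).most_common():
    print(f"{stage}: {count} failing record(s)")
```

In this made-up data, every failing record comes from a single stage, which is the kind of pattern that would point a root-cause analysis toward fixing that stage rather than patching individual values.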






