Understanding the Cost-Benefit Quality Curve
Anyone who begins a major data quality enhancement project is likely to have high expectations for the improvements that will result. The users of the existing data understand all its failings and will be eager to work with the enhanced quality the project promises. I have often heard people cite targets like 99.9 percent accuracy, 99.9 percent completeness or real-time data access. While these are laudable goals, they are surprisingly difficult to achieve. And it's never been clear to me that the end users actually need these quality levels. As John Kay observed, "In a world of imperfect knowledge and irresolvable uncertainty - of unknown unknowns - the quest for exact knowledge gets in the way of useful knowledge." [1]
The aim of this column is to help you set realistic expectations for the delivery of useful knowledge, not perfect information. You may worry that promising too little in quality improvements will risk damaging the business case for your project. Let me assure you from years of watching and implementing these types of projects that promising too much will ensure dissatisfaction with the final results. The goal is quality improvement targets that are achievable and that deliver useful knowledge to your information customers.
Because every data improvement project has different characteristics, let me introduce a concept for the sake of discussion. The concept is a just noticeable difference (JND). It comes from Ernst Weber, a 19th-century pioneer of experimental psychology, whose work defined the smallest difference that we can perceive between things like tones or skin touches. It applies equally well to data quality: what is the smallest change in a quality attribute that your users will perceive as an improvement? For example, if it takes your current customer information system 30 days to incorporate new customers' names and addresses into your sales customer relationship management (CRM) system, what would be the smallest change that would cause the sales team to notice an improvement - a reduction to 10 days, five days or overnight? In other words, I am using the JND concept to describe the smallest improvement that would have a noticeable effect on how your data users derive benefit from quality enhancements.
Many of you may have recognized that going from 30 days to 10 days in loading and releasing the new customer data is probably a matter of workflow enhancements. This is relatively easy to do when compared to what would be required to reduce the time from five days to overnight. Overnight updates would require some combination of workflow, database and system enhancements or replacements.
This low-hanging fruit pattern is very common in data quality enhancement programs. There are always new processes, new systems and tweaks to databases that can be quickly implemented, and your information customers will notice real improvements in quality when you release them. The cost of subsequent quality improvements will increase rapidly, however. In most cases, cost rises close to exponentially with each additional JND, as shown in Figure 1.
Figure 1 simply illustrates the cost of initial development. There is also the cost of maintaining the data going forward. The incremental costs of maintenance are not exponential, but the greater the promised quality with the new system, the greater the costs will be. In short, each JND of data quality will bring something close to exponentially greater development costs and dramatically higher maintenance expenses.
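To make the shape of this relationship concrete, here is a minimal Python sketch. It assumes, purely for illustration, that the development cost of each additional JND is a fixed multiple of the previous one and that annual maintenance grows linearly with the number of JNDs delivered. The function names, base costs and growth factor are hypothetical values, not figures from any real project.

```python
def development_cost(jnds, base_cost=100_000.0, growth=2.0):
    """Cumulative development cost of delivering `jnds` noticeable improvements.

    Assumption: the k-th JND costs base_cost * growth**(k - 1), so each step
    costs `growth` times the previous one (an exponential relationship).
    """
    return sum(base_cost * growth ** (k - 1) for k in range(1, jnds + 1))


def annual_maintenance_cost(jnds, per_jnd=20_000.0):
    """Assumption: yearly maintenance grows linearly with the quality promised."""
    return per_jnd * jnds


def total_cost(jnds, years=3):
    """Development cost plus maintenance over a planning horizon of `years`."""
    return development_cost(jnds) + years * annual_maintenance_cost(jnds)


if __name__ == "__main__":
    # Print how quickly costs climb as more JNDs are promised.
    for n in range(1, 6):
        print(n, development_cost(n), total_cost(n))
```

With these made-up numbers, the first JND is cheap and the fifth costs sixteen times as much to build, while the maintenance bill climbs steadily alongside it. Your own figures will differ, but the exercise of writing them down is what matters.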
Take the time to operationalize the quality dimensions that matter to your information customers. Is it timeliness of updates, accuracy of relationships in hierarchies or completeness of customer records? In other words, what are the missing quality attributes that are inhibiting the success of your information customers? Once you have defined these customer-facing dimensions, determine how much improvement is really necessary. Remember, you and your information customers are searching for useful knowledge, not perfect data.
Perfect data is never an option, so determine what quality levels will drive JND improvements in your customers' perceptions of the information at a cost that your organization can afford. Almost all information end users understand the cost-benefit analysis when you describe it for them because this kind of relationship is common in the world. The key is to have the discussion before you define the project deliverables.
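One way to structure that discussion is to list the candidate improvements in order, estimate what each would cost and what it would be worth, and stop where the numbers no longer work. The Python sketch below is only an illustration of that reasoning. The function name `affordable_jnd_target`, the stopping rule and all cost and benefit figures are assumptions, not a prescribed method.

```python
def affordable_jnd_target(step_costs, step_benefits, budget):
    """Return (number of JND steps to commit to, total cost of those steps).

    Steps are considered in order. Stop at the first step whose estimated
    benefit no longer covers its cost, or whose cost would exceed the budget.
    """
    spent = 0.0
    target = 0
    for cost, benefit in zip(step_costs, step_benefits):
        if benefit < cost or spent + cost > budget:
            break
        spent += cost
        target += 1
    return target, spent


if __name__ == "__main__":
    # Hypothetical steps: 30 -> 10 days, 10 -> 5 days, 5 -> 2 days, 2 days -> overnight.
    costs = [50, 100, 200, 400]      # estimated cost of each step (thousands)
    benefits = [300, 250, 220, 150]  # estimated value of each step (thousands)
    print(affordable_jnd_target(costs, benefits, budget=500))
```

In this made-up case the first three steps pay for themselves and fit the budget, while the final move to overnight updates costs far more than it returns, so the sensible target is three JNDs.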
1. John Kay. "Beware the fruitless search for sharp predictions." Financial Times, October 17, 2007.