Gordon Hamilton recently emailed me with an excellent recommended topic for a data quality blog post:

It always seems crazy to me that few executives base their ‘corporate wagers’ on the statistical research touted by data quality authors such as Tom Redman, Jack Olson and Larry English that shows that 15-45% of the operating expense of virtually all organizations is WASTED due to data quality issues.

So, if every organization is leaving 15-45% on the table each year, why don’t they do something about it? Philip Crosby says that quality is free, so why do the executives allow the waste to go on and on and on? It seems that if the shareholders actually think about the Data Quality Wager they might wonder why their executives are wasting their shares’ value. A large portion of that 15-45% could all go to the bottom line without a capital investment.

I’m maybe sounding a little vitriolic because I’ve been re-reading Deming’s ‘Out of the Crisis’ and he has a low regard for North American industry because they won’t move beyond their short-term goals to build a quality organization, let alone implement Deming’s 14 principles or Larry English’s paraphrasing of them in a data quality context.

The Data Quality Wager

Gordon Hamilton explained in his email that his reference to the Data Quality Wager was an allusion to Pascal’s Wager, but what follows is my rendering of it in a data quality context (i.e., if you don’t like what follows, please yell at me, not Gordon).

Although I agree with Gordon, I also acknowledge that convincing your organization to invest in data quality initiatives can be a hard sell. A common mistake is failing to frame the investment in business language such as mitigated risks, reduced costs, or increased revenue. I also acknowledge the reality of the fiscal calendar effect: most initiatives increase short-term costs in exchange for the long-term potential of eventually mitigating risks, reducing costs, or increasing revenue.

Short-term increased costs of a data quality initiative can include the purchase of data quality software and its maintenance fees, as well as the professional services needed for training and consulting for installation, configuration, application development, testing, and production implementation. And there are often additional short-term increased costs, both external and internal.

Please note that I am talking about the costs of proactively investing in a data quality initiative before any data quality issues have manifested that would prompt a reactive investment in a data cleansing project. The short-term increased costs are the same either way. I am simply acknowledging the reality that it is always easier for a reactive project to get funding than for a proactive program, and this is obviously true not only for data quality initiatives.

Therefore, the organization has to evaluate the possible outcomes of proactively investing in data quality initiatives against the possibility that data quality issues exist (i.e., tangible, business-impacting data quality issues):

  1. Invest in data quality initiatives + Data quality issues exist = Decreased risks and (eventually) decreased costs
  2. Invest in data quality initiatives + Data quality issues do not exist = Only increased costs — No ROI
  3. Do not invest in data quality initiatives + Data quality issues exist = Increased risks and (eventually) increased costs
  4. Do not invest in data quality initiatives + Data quality issues do not exist = No increased costs and no increased risks
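The four outcomes above can be written out as a small decision table. What follows is only a minimal sketch: the outcome labels come from the list, but the `expected_cost` function and every number passed to it are hypothetical placeholders I made up for illustration, not figures from Gordon's email or from any of the authors he cites. It assumes that investing removes some fraction of the cost that data quality issues would otherwise cause.

```python
from itertools import product

# Outcome labels taken from the four cases listed above,
# keyed by (invest, issues_exist).
OUTCOMES = {
    (True, True): "Decreased risks and (eventually) decreased costs",
    (True, False): "Only increased costs, no ROI",
    (False, True): "Increased risks and (eventually) increased costs",
    (False, False): "No increased costs and no increased risks",
}


def wager_outcome(invest: bool, issues_exist: bool) -> str:
    """Look up the outcome for one combination of choice and reality."""
    return OUTCOMES[(invest, issues_exist)]


def expected_cost(
    invest: bool,
    p_issues: float,
    investment_cost: float,
    issue_cost: float,
    mitigation: float,
) -> float:
    """Expected cost of a choice under a hypothetical, simplified model.

    p_issues        -- assumed probability that data quality issues exist
    investment_cost -- assumed short-term cost of the initiative
    issue_cost      -- assumed cost of unaddressed data quality issues
    mitigation      -- assumed fraction of issue_cost the initiative removes
    """
    if invest:
        return investment_cost + p_issues * issue_cost * (1 - mitigation)
    return p_issues * issue_cost


if __name__ == "__main__":
    # Print the four outcomes in the same order as the list above.
    for invest, issues_exist in product([True, False], repeat=2):
        print(f"invest={invest}, issues_exist={issues_exist}: "
              f"{wager_outcome(invest, issues_exist)}")

    # Hypothetical inputs only; substitute your organization's own estimates.
    for invest in (True, False):
        cost = expected_cost(invest, p_issues=0.8, investment_cost=100.0,
                             issue_cost=1000.0, mitigation=0.5)
        print(f"invest={invest}: expected cost = {cost}")
```

Under this toy model, the wager tips toward investing whenever the assumed probability of issues, multiplied by the cost the initiative would remove, exceeds the cost of the initiative itself. The hard part in practice is agreeing on those estimates.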

Data quality professionals, vendors, and industry analysts all strongly advocate #1, and all strongly criticize #3. (Additionally, since we believe data quality issues exist, most “orthodox” data quality folks generally refuse to even acknowledge #2 and #4.)

Unfortunately, when advocating #1, we often don’t effectively sell the business benefits of data quality, and when criticizing #3, we often focus too much on the negative aspects of not investing in data quality.

Only #4 “guarantees” neither increased costs nor increased risks, but it does so by gambling that data quality issues do not exist and therefore not investing in data quality initiatives. By default, this is how many organizations make the Data Quality Wager.

How is your organization making the Data Quality Wager?