“As a single stone causes concentric ripples in a pond,” Martin Doyle commented on my blog post There is No Such Thing as a Root Cause, “there will always be one root cause event creating the data quality wave.  There may be interference after the root cause event which may look like a root cause, creating eddies of side effects and confusion, but I believe there will always be one root cause. Work backwards from the data quality side effects to the root cause and the data quality ripples will be eliminated.”

Martin Doyle and I continued our congenial blog comment banter on my podcast episode The Johari Window of Data Quality, but in this blog post I wanted to focus on the stone-throwing metaphor for root cause analysis.

Let’s begin with the concept of a single stone causing the concentric ripples in a pond. Is the stone really the root cause? Who threw the stone? Why did that particular person choose to throw that specific stone? How did the stone come to be alongside the pond? Which path did the stone-thrower take to get to the pond? What happened to the stone-thrower earlier in the day that made them want to go to the pond, and once there, pick up a stone and throw it in the pond?

My point is that while root cause analysis is important to data quality improvement, too often we can get carried away riding the ripples of what we believe to be the root cause of poor data quality. Adding to the complexity is the fact there’s hardly ever just one stone. Many stones get thrown into our data ponds, and trying to un-ripple their poor quality effects can lead us to false conclusions because causation is non-linear in nature. Causation is a complex network of many interrelated causes and effects, so some of what appear to be the effects of the root cause you have isolated may, in fact, be the effects of other causes.

As Laura Sebastian-Coleman explains, data quality assessments are often “a quest to find a single criminal—The Root Cause—rather than to understand the process that creates the data and the factors that contribute to data issues and discrepancies.” Those approaching data quality this way, “start hunting for the one thing that will explain all the problems. Their goal is to slay the root cause and live happily ever after. Their intentions are good. And slaying root causes—such as poor process design—can bring about improvement. But many data problems are symptoms of a lack of knowledge about the data and the processes that create it. You cannot slay a lack of knowledge. The only way to solve a knowledge problem is to build knowledge of the data.”

Believing that you have found and eliminated the root cause of all your data quality problems is like believing that after you have removed the stones from your pond (i.e., data cleansing), you can stop the stone-throwers by building a high stone-deflecting wall around your pond (i.e., defect prevention). However, there will always be stones (i.e., data quality issues) and there will always be stone-throwers (i.e., people and processes) that will find a way to throw a stone in your pond.

In our recent podcast Measuring Data Quality for Ongoing Improvement, Laura Sebastian-Coleman and I discussed although root cause is used as a singular noun, just as data is used as a singular noun, we should talk about root causes since, just as data analysis is not analysis of a single datum, root cause analysis should not be viewed as analysis of a single root cause.

The bottom line, or, if you prefer, the ripple at the bottom of the pond, is the Stone Wars of Root Cause Analysis will never end because data quality is a journey, not a destination. After all, that’s why it’s called ongoing data quality improvement.