I recently came across a great quote on data quality by Ken Orr in "The Good, the Bad and the Data Quality" from the Cutter Consortium: "Ultimately, poor data quality is like dirt on the windshield. You may be able to drive for a long time with slowly degrading vision, but at some point, you either have to stop and clear the windshield or risk everything."

"Dirty data" can be a problem for any type of system, but data quality and MDM are inextricably linked, because the net purpose of MDM initiatives is to deliver a single source of truth on one or more master data domains containing accurate, complete, timely and consistent data. Without early, systematic attention to high levels of data quality (plus the right data quality tools and solid data governance to resolve the issues that inevitably come up) your master data hub will simply be a fast, automated way to shoot yourself in the foot.

In 2001, Gartner analysts Scott Nelson and Jennifer Kirkby reported that ignoring data quality is the number one reason for CRM project failures. I think this is true for MDM projects as well. Without robust data profiling and data cleansing efforts in your project, you're likely to end up with an MDM repository that the executives and users in your organization don't trust. With this in mind, I have three real-world recommendations for incorporating data quality into your MDM initiative.

1. Profile Early and Often

A key step in building a strong business case for MDM is to use the results of thorough data profiling to find the most serious data quality problems across the enterprise, attach dollar amounts to the cost of leaving the data dirty or to the benefits of fixing it, and then make sure your MDM initiative incorporates technology that can actually deliver those fixes.

By understanding the data and the corresponding levels of data quality in your source systems early on, you can avoid nasty surprises later in the project. Make sure to document everything you discover, particularly the metadata (information about the information). Organizations often don't have any written documentation on major systems, so what you discover in the process of profiling the data will be valuable in itself.
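To make the idea of profiling concrete, here is a minimal sketch in Python. It is not the output of any particular profiling tool. The `profile` function, the sample `customers` records and the column names are all hypothetical, chosen only to show the kind of facts (completeness, distinct values, most frequent values) a first profiling pass usually records.

```python
from collections import Counter


def profile(records, columns):
    """Return completeness, distinct count and top values for each column.

    Hypothetical helper for illustration; real profiling tools report far more.
    """
    report = {}
    total = len(records)
    for col in columns:
        values = [r.get(col) for r in records]
        # Treat None and empty strings as missing values.
        filled = [v for v in values if v not in (None, "")]
        counts = Counter(filled)
        report[col] = {
            "completeness": len(filled) / total if total else 0.0,
            "distinct": len(counts),
            "top_values": counts.most_common(3),
        }
    return report


# Made-up source system extract, assumed for this example only.
customers = [
    {"id": 1, "name": "Acme Corp", "state": "CA"},
    {"id": 2, "name": "Acme Corp.", "state": "ca"},
    {"id": 3, "name": "Globex", "state": ""},
    {"id": 4, "name": "Initech", "state": None},
]

report = profile(customers, ["name", "state"])
for column, stats in report.items():
    print(column, stats)
```

Even on four rows, the report surfaces the kinds of surprises worth documenting: half the state values are missing, and "CA" and "ca" are counted as two distinct values.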

2. Cleanse Your Data Automatically Where Possible

Make sure your MDM hub has data cleansing capabilities, either built in or through integration with one of the leading data quality tools like DataFlux, Business Objects, IBM, Informatica or Trillium, or with smaller players like Pitney Bowes Group 1, Human Inference or AMB New Generation Data Empowerment.

By employing either a built-in or external data quality tool, you'll be able to develop business rules that automatically fix common errors, cleanse and standardize data before it ever goes into the hub, deduplicate source system data, and verify and correct addresses. Automating this work will save money and allow you to concentrate your human resources on the "gray area" problems, where humans do a better job than computers.
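As a rough illustration of what such business rules look like, the sketch below standardizes names and state codes, drops exact duplicates after standardization, and routes incomplete records to a review queue for a person to handle. It is a simplified stand-in written in plain Python, not the API of any of the tools named above. The `SUFFIXES` table, the function names and the sample records are assumptions made for the example, and real tools use much more sophisticated fuzzy matching.

```python
import re

# Hypothetical standardization rule: expand common company suffixes.
SUFFIXES = {"corp": "corporation", "inc": "incorporated", "co": "company"}


def standardize_name(name):
    """Lowercase, strip punctuation and expand known suffixes."""
    cleaned = re.sub(r"[^\w\s]", "", name).lower()
    return " ".join(SUFFIXES.get(word, word) for word in cleaned.split())


def standardize_state(state):
    """Trim whitespace and uppercase; missing values become an empty string."""
    return (state or "").strip().upper()


def cleanse_and_dedupe(records):
    """Return (records ready to load, records needing human review)."""
    survivors = {}
    review_queue = []
    for rec in records:
        clean = {
            "id": rec["id"],
            "name": standardize_name(rec["name"]),
            "state": standardize_state(rec.get("state")),
        }
        if not clean["state"]:
            # "Gray area" record: leave it for a data steward.
            review_queue.append(clean)
            continue
        # Exact match on standardized fields; the first record seen survives.
        survivors.setdefault((clean["name"], clean["state"]), clean)
    return list(survivors.values()), review_queue


# Made-up source records, assumed for this example only.
source = [
    {"id": 1, "name": "Acme Corp", "state": "CA"},
    {"id": 2, "name": "Acme Corp.", "state": " ca "},
    {"id": 3, "name": "Globex", "state": ""},
]

to_load, needs_review = cleanse_and_dedupe(source)
print(to_load)
print(needs_review)
```

The two Acme records collapse into one once punctuation, case and whitespace are standardized, while the Globex record with no state goes to the review queue instead of being guessed at.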

3. Create an Ongoing Data Governance Program

You started with in-depth profiling of the source system data you'll be loading into your hub and went on to use a data quality tool to standardize, validate and correct the data before loading it into the hub. Now you've got to spend some serious time on the people and process aspects of MDM. That is, you need to develop an ongoing data governance program.

Typically, a data governance program consists of business data stewards and the IT stewards who support them from a technical perspective. Additionally, a data governance board is responsible for resolving issues and establishing and managing the MDM hub. This board defines the policies and procedures for data governance processes and the roles and responsibilities for the data governance organization. The business and IT stewards report to the data governance board and are responsible and accountable for day-to-day improvements in data quality as well as longer-term, more strategic programs for managing the enterprise's critical information assets, including the expansion to other data domains.

This continuous improvement approach requires tracking clearly defined metrics that measure levels of data quality over time as well as the benefits produced by the MDM and data governance initiative, to make sure the program is achieving the expected ROI.
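One simple way to track a data quality metric over time is sketched below: compute the completeness of a field for each periodic snapshot of the hub and compare it with a target set by the data governance board. The `completeness` and `trend` functions, the `email` field, the 0.9 target and the snapshot data are all hypothetical. Your own metrics and thresholds should come from your governance program.

```python
from datetime import date


def completeness(records, field):
    """Share of records with a non-empty value for the given field."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def trend(snapshots, field, target):
    """Return (snapshot date, score, target met?) rows in date order."""
    rows = []
    for snapshot_date, records in sorted(snapshots.items()):
        score = completeness(records, field)
        rows.append((snapshot_date, round(score, 2), score >= target))
    return rows


# Made-up monthly snapshots of hub records, assumed for this example only.
snapshots = {
    date(2024, 2, 1): [{"email": "a@example.com"}, {"email": "b@example.com"}],
    date(2024, 1, 1): [{"email": "a@example.com"}, {"email": ""}],
}

history = trend(snapshots, "email", target=0.9)
for row in history:
    print(row)
```

Keeping a history like this makes it possible to show whether stewardship work is actually moving the numbers rather than relying on impressions.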

By building a data quality mind-set into your MDM program, you'll dramatically increase your chances of success. You'll also have more clearly articulated reasons for bringing together source data from around the enterprise, cleansing and matching it, and loading it into an MDM hub where a team of data stewards will proactively manage it for the benefit of the rest of the organization.