JUN 1, 2004 1:00am ET

The Bane of CRM: Data Quality

As corporate IT departments across the planet struggle to implement customer relationship management (CRM), business intelligence (BI) and data warehousing (DW), trade magazines and convention presentations echo the same painful stories. Years and dollars later, the systems are in place but the results are not worth the effort. The most common underlying reason: poor data quality.

Quite frankly, most sane people don't find cleansing data any more fun than cleaning the toilet. When given the time, analysts are happy to look at data to find patterns of error, programmers are happy to code validations when they are requested and users try to enter things correctly, but mistakes still happen. Only the most compulsive among us will take the time to research a data problem, determine if and where bad data elements might be in use and find a way to correct it. So, clean data moves in and out of the system smoothly, but dirty data hangs around like laundry on a teenager's floor.

System implementation projects are planned without allocating enough time to cleanse dirty data. Often the existence of dirty data is acknowledged, and cleanup may be attempted at some level. Most often, though, the bulk of the dirty data is converted as is, loading new systems with all the old data problems, soon to be joined by a whole crop of new data anomalies. New operational systems simply create bad data faster. Often, CRM systems sitting on top of them are, at best, no improvement over old systems and, at worst, dismal, expensive failures.

Industry experts and vendors alike were quick to see this trend, publicize it and tout solutions. Most articles and presentations focus on one or two aspects of data quality. I thought it might be helpful to pull together both the most common lessons learned and some rules to use as a starting point for those just beginning or those beginning again. I have been deeply embedded in these activities, and I've had lots of opportunity to talk with others doing similar things. Here's a summary of the most common observations.

Key to Success

The key to successful CRM, BI and DW ventures is data quality. The most elegant system in the world will fail without it. The simplest systems can deliver amazing results if they operate on clean data.

Cleaning is Ongoing

Achieving a high level of data quality requires three efforts: initial cleansing, error prevention and ongoing cleansing. Of these, the most important is ongoing cleansing. Without it, your shiny new system will gradually corrode and fall into ruin.

Unfortunately, most efforts focus on initial cleanup and then die out. Imagine what your house would look like if you only cleaned it when you moved in.

Integrate Quality Efforts

Data quality must be integrated into business processes. You can expect users to have the desire to maintain high quality, and they will have tolerance for some measure of additional effort to do so. How much tolerance will depend on your organization's ability to demonstrate the benefits of CRM, BI and/or DW to them, and your ability to provide online validations to prevent mistakes.

Unfortunately, you should not expect your IT department to embrace this. As deadlines approach, quality slips silently off the radar. If left to IT, data quality will lie there in the dust (along with documentation tasks) unless a special effort is made to rescue it. If asked, most IT folks will say this responsibility lies entirely with the user community and requires no effort from IT.
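As a rough illustration of the online validations mentioned above, here is a minimal Python sketch of an entry-time check on a customer record. The record layout, the field names and the US ZIP rule are all assumptions made for the example; real rules would come from your own data and would be far more involved for international addresses.

```python
import re
from dataclasses import dataclass


# Hypothetical record layout; the field names are illustrative only.
@dataclass
class CustomerRecord:
    company_name: str
    contact_name: str
    postal_code: str
    country: str


# Assumed rule: a US ZIP code is 5 digits, optionally followed by "-" and 4 digits.
US_ZIP = re.compile(r"\d{5}(-\d{4})?")


def validate(record: CustomerRecord) -> list[str]:
    """Return a list of problems found; an empty list means the record passes."""
    problems = []
    if not record.company_name.strip():
        problems.append("company name is blank")
    if not record.contact_name.strip():
        problems.append("contact name is blank")
    # Only US postal codes are checked in this sketch.
    is_us = record.country.strip().upper() == "US"
    if is_us and not US_ZIP.fullmatch(record.postal_code.strip()):
        problems.append("postal code is not a valid US ZIP format")
    return problems
```

A data entry screen could call validate() before saving and show the returned messages to the user, so the mistake is caught while the person who knows the right answer is still at the keyboard.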

Data Analysts and Administrators

Most people will agree that data quality is important; however, people who are truly passionate about data quality are rare. I usually hear them described in less complimentary terms, but I've chosen "rare" because I haven't met many of them. I am one who has been masquerading as a data warehouse architect for some time. The most successful data quality efforts are managed by one of these people.

Cost

Achieving a sustained high level of data quality requires tools and time. The complexity of these tools and the amount of time required to keep data clean are consistently and significantly underestimated by everyone except those whose passion is data quality. If you are dealing with international data, the complexity is nearly incomprehensible for any one individual.

Manage Expectations

No matter what you do or how well you do it, perfection is not attainable. Managing this expectation across IT management, and to a lesser extent across the user community, is critical. Each pass made to the data results in successive approximations of perfection; if your dataset were static, you might achieve near perfection. In real life, if you work really hard and have a bit of luck, you can hope to maintain a state where you can clean data slightly faster than it gets screwed up.
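To put rough numbers on the idea of successive approximations, here is a small Python sketch of a toy model. It assumes each cleansing pass fixes a fixed fraction of the existing errors and that a fixed number of new errors arrives between passes. Both assumptions, and the function and parameter names, are mine for illustration; real error rates are rarely this tidy.

```python
def remaining_errors(initial_errors: float, fix_rate: float,
                     new_errors_per_pass: float, passes: int) -> list[float]:
    """Error count after each pass under the toy model described above.

    fix_rate is the assumed fraction of existing errors corrected per pass;
    new_errors_per_pass is the assumed number of fresh errors arriving
    between passes.
    """
    counts = []
    errors = initial_errors
    for _ in range(passes):
        # Fix a fraction of what is there, then add the newly created errors.
        errors = errors * (1.0 - fix_rate) + new_errors_per_pass
        counts.append(errors)
    return counts


# Static dataset: no new errors arrive, so each pass halves the backlog.
static = remaining_errors(1000, 0.5, 0, 3)    # [500.0, 250.0, 125.0]

# Live dataset: 100 new errors per pass keep the backlog from reaching zero.
live = remaining_errors(1000, 0.5, 100, 3)    # [600.0, 400.0, 300.0]
```

Under this model the live backlog settles toward new_errors_per_pass / fix_rate (200 in the example) rather than zero, which is one way to explain to management why the cleanup never finishes.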

Given this somewhat irreverent and admittedly dismal view of the data quality world, how would you start? There are lots of consultants and vendors who will be happy to help you. There may be project managers in your organization who have some ideas as well. But the reality is that your data is unique to your organization and the way you handle it should be as well. The deck is stacked against you: there is no "one size fits all" solution; if there were, you probably wouldn't have a data quality problem in the first place.

In addition to the observations I've made, I have come up with some rules I wish someone had given us before we started. For those of you who like numbers, I've attempted to quantify a few of these things. Much of my data quality experience has involved company and person names and addresses, and they make good examples since nearly everyone has customers. I considered calling these guidelines, and you may consider them that way, as long as you realize that there is a logarithmic relationship between breaking them and future pain points.
