Best Practices for Repairing Your Most Valuable Asset
Information Management Newsletters, March 5, 2010
Every corporate decision and operation relies in part on the underlying data. High-quality data is as much an asset to the organization as a hard-working employee, and it’s time for businesses to recognize that data quality isn’t a place to cut corners. By taking a page out of the health care book, performing good preventive maintenance on data and treating data quality issues quickly when they arise, businesses can improve data quality and increase productivity.
Many factors contribute to why companies struggle to properly manage their data. According to Gartner Research, the volume of enterprise data doubles every 18 months. If a data governance process is not in place, this volume of data quickly becomes unmanageable. Complicating matters are issues with inaccurate, duplicate or out-of-date data. Without stringent quality measures, organizations could be basing critical business decisions on flawed or inaccurate data. After all, analysis of your customer data is only as good as the quality of the data you’re working with.
While most enterprise architects leave the quality of data up to database administrators - with the assumption that if you own the data you’re responsible for its quality - this is not always the most sensible approach. Businesses should adhere to three fundamental best practices to ensure that their data is complete and up-to-date.
Data Profiling: Assessing Your Most Valuable Asset
Data should be viewed as a corporate asset. It has measurable value integral to achieving strategic objectives and gaining a competitive edge. However, for data to really be an asset, it must be used while fresh. To be consistently used, it needs to be complete and regularly refreshed.
First, profile your data. You wouldn’t buy a house without first having it inspected. A qualified inspector will look at the foundation and identify building flaws that could create a problem in the future. You want the same kind of information about your corporate data. Virtually all data quality profiling tools will provide counts on the percentages of fields that are populated, but for real insight, you need to be able to view key data values as well. For example, are there numbers or symbols in fields where only text is appropriate? How many of your unique identifiers (customer number, account number, etc.) are not unique? This information can help you identify outliers, anomalies and other questionable data points and direct you to your organization’s larger data quality issues.
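As a rough illustration of the checks described above, the following Python sketch computes the percentage of populated fields, lists values containing numbers or symbols in a text-only field, and counts identifiers that are not unique. The field names (customer_id, name, phone), the sample records and the profile function are hypothetical, and the rule for what counts as "text" (ASCII letters, spaces, periods, apostrophes and hyphens) is an assumption that would need adjusting for real data.

```python
import re
from collections import Counter

# Assumed definition of "text only": ASCII letters, spaces, periods,
# apostrophes and hyphens. Anything else is flagged for review.
NON_TEXT = re.compile(r"[^A-Za-z .'\-]")

def profile(records, text_fields=(), id_field=None):
    """Return simple profiling statistics for a list of dict records."""
    total = len(records)
    fields = sorted({key for rec in records for key in rec})
    report = {"populated_pct": {}, "non_text_values": {}, "duplicate_ids": {}}

    # Percentage of records in which each field holds a non-blank value.
    for field in fields:
        filled = sum(1 for rec in records if str(rec.get(field) or "").strip())
        report["populated_pct"][field] = round(100.0 * filled / total, 1) if total else 0.0

    # Values with numbers or symbols in fields where only text is appropriate.
    for field in text_fields:
        report["non_text_values"][field] = [
            rec[field] for rec in records
            if rec.get(field) and NON_TEXT.search(rec[field])
        ]

    # "Unique" identifiers that appear more than once.
    if id_field:
        counts = Counter(rec[id_field] for rec in records if rec.get(id_field))
        report["duplicate_ids"] = {key: n for key, n in counts.items() if n > 1}
    return report

# Hypothetical sample data for illustration only.
sample = [
    {"customer_id": "C001", "name": "Ann Lee", "phone": "203-555-0142"},
    {"customer_id": "C002", "name": "B0b Sm!th", "phone": ""},
    {"customer_id": "C001", "name": "Ann Lee", "phone": "203-555-0142"},
    {"customer_id": "C003", "name": "", "phone": "111-111-1111"},
]
report = profile(sample, text_fields=["name"], id_field="customer_id")
```

A report like this only points to where the problems are; deciding which flagged values are genuinely wrong still takes a look at the data itself.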
Scrub Data Regularly to Ensure Accuracy
There are four key steps to data cleansing. Taken together, they will ensure that your data provides the best foundation for analysis and informed decision-making across the enterprise.
- Format fields. You want consistent terms and formats across a given field. If you are relying on human beings to input the data, this will not happen automatically.
- Parse components. Break down strings of data into multiple fields so you can more effectively standardize data elements with greater accuracy.
- Check content. Some records include accurate information that is embedded in the wrong fields. Other fields may appear populated but are not accurate (for example, a phone number field that looks like: “111-111-1111”). Your data cleansing process should identify and correct these anomalies so your data is fit for use.
- Eliminate duplicates. Identify matches and eliminate duplicate records. Once your data is standardized, this can be done with a high degree of confidence.
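The four steps above can be sketched in Python as a small pipeline. This is a minimal illustration rather than a production cleansing routine: the field names (full_name, state, phone), the list of placeholder phone numbers and the exact-match duplicate rule are all assumptions made for the example, and this version of the content check simply blanks suspect phone numbers instead of correcting them.

```python
import re

# Assumed list of obviously fake phone numbers, for illustration only.
PLACEHOLDER_PHONES = {"1111111111", "0000000000", "1234567890"}

def format_fields(rec):
    """Step 1: collapse stray whitespace and use one format per field."""
    out = {key: " ".join(str(value).split()) for key, value in rec.items()}
    out["state"] = out.get("state", "").upper()
    return out

def parse_name(rec):
    """Step 2: break the full_name string into separate fields."""
    out = dict(rec)
    first, _, last = out.pop("full_name", "").partition(" ")
    out["first_name"], out["last_name"] = first.title(), last.title()
    return out

def check_content(rec):
    """Step 3: blank out phone numbers that are populated but not plausible."""
    out = dict(rec)
    digits = re.sub(r"\D", "", out.get("phone", ""))
    valid = len(digits) == 10 and digits not in PLACEHOLDER_PHONES
    out["phone"] = f"{digits[:3]}-{digits[3:6]}-{digits[6:]}" if valid else ""
    return out

def deduplicate(records, key_fields):
    """Step 4: keep the first record for each combination of key fields."""
    seen, unique = set(), []
    for rec in records:
        key = tuple(rec.get(field, "").lower() for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

def cleanse(records):
    cleaned = [check_content(parse_name(format_fields(rec))) for rec in records]
    return deduplicate(cleaned, ("first_name", "last_name", "phone"))

# Hypothetical sample data for illustration only.
raw = [
    {"full_name": "ann  lee", "state": "ct", "phone": "(203) 555-0142"},
    {"full_name": "Ann Lee", "state": "CT", "phone": "203.555.0142"},
    {"full_name": "bob smith", "state": "ny", "phone": "111-111-1111"},
]
clean = cleanse(raw)
```

The order matters: the two Ann Lee records only match once their names and phone numbers have been standardized, which is why duplicate elimination comes last.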
It cannot be emphasized enough that data cleansing is not a one-time operation. Even the best data gets stale. Also, because customers have unprecedented access to their data (online, over the phone, through the mail, in the store or branch, etc.), there are multiple opportunities for changes and mistakes to enter your system.
An ongoing program that includes both batch and real-time maintenance is the best defense against poor data quality. In batch, run your data through the data cleansing steps regularly and you will be able to correct issues as they arise. Annual data cleansing is the bare minimum; a best practice is to perform this task at least quarterly. Pair batch with real-time data quality applications that validate data as it is entered, serving as a data quality firewall.
Both processes have advantages that complement one another. Managing data quality at the point of entry requires speed and reliability on a transactional basis; batch processes allow for more thorough and complete cleansing.
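To make the point-of-entry idea concrete, here is a minimal Python sketch of a validation gate that checks each record before it is stored. The required fields, the simplified email pattern, the 10-digit phone rule and the in-memory list standing in for a database are all assumptions for illustration; a real deployment would apply the business rules set by the organization.

```python
import re

# Deliberately simple pattern: something@something.something, no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate_entry(record):
    """Return a list of problems; an empty list means the record may be stored."""
    problems = []
    if not record.get("customer_id", "").strip():
        problems.append("customer_id is required")
    if not EMAIL_RE.match(record.get("email", "")):
        problems.append("email is not well formed")
    digits = re.sub(r"\D", "", record.get("phone", ""))
    if len(digits) != 10 or len(set(digits)) == 1:
        problems.append("phone must be 10 digits and not one repeated digit")
    return problems

def accept(record, store):
    """Store the record only if it passes every check; report any problems."""
    problems = validate_entry(record)
    if not problems:
        store.append(record)
    return problems

# Hypothetical usage: the list stands in for the system of record.
store = []
ok = accept({"customer_id": "C010", "email": "ann@example.com", "phone": "203-555-0142"}, store)
bad = accept({"customer_id": "", "email": "ann@example", "phone": "111-111-1111"}, store)
```

Checks at this stage are kept cheap so they can run on every transaction; heavier work such as duplicate matching across the whole customer file is left to the batch run.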
Take Ownership of Data Quality
Data quality often falls by the wayside because it’s not clear who owns the data.
For most companies, the responsibility lies with the data governance committee. The commitment to sound data quality and security practices must begin at the top of the organization and include stakeholders at every level. Best practice is for the data governance committee to represent exactly this mix.
Data governance is an ongoing commitment. As an organization’s needs change, its data governance policies must be reviewed to ensure alignment. On a day-to-day basis, many organizations are recognizing and embracing the evolving role of the data steward. This role began in the IT arena, but trends indicate that it is branching out as the level of accountability entrusted to the stewards increases. It is largely the data steward (also a member of the data governance team) who will determine the business rules for an organization’s data quality.
Ensuring your organization’s data quality is an ongoing process. After assessing data quality and taking steps to fix incomplete, inaccurate or duplicate data, organizations need to create data governance programs. Without a plan and routine maintenance, companies may find themselves having to deal with bad data on a continual basis, which can derail the success of any business initiative. By ensuring that your organization has a solid foundation, you can be confident that all your efforts, whether they involve targeted marketing, inventory management or even bill production, are based on accurate and complete data.
Navin Sharma is the Director of Global Product Strategy, Data Quality for Pitney Bowes Business Insight. He has more than 10 years of experience in data management and helping companies with modeling, analysis, design and implementation of their data management strategies. Mr. Sharma manages the ever-expanding Pitney Bowes Spectrum Technology Platform products and solutions.