Every corporate decision and operation relies in part on the underlying data. High-quality data is as much an asset to the organization as a hard-working employee, and it's time for businesses to recognize that data quality isn't a place to cut corners. By taking a page from health care's book, performing good preventive maintenance on data as well as quick treatment when quality issues arise, organizations will see better data quality and higher productivity.

Many factors explain why companies struggle to manage their data properly. According to Gartner Research, the volume of enterprise data doubles every 18 months. Without a data governance process in place, this volume of data quickly becomes unmanageable. Complicating matters are issues with inaccurate, duplicate or out-of-date data. Without stringent quality measures, organizations could be basing critical business decisions on flawed data. After all, analysis of your customer data is only as good as the quality of the data you're working with.

Most enterprise architects leave data quality to database administrators, on the assumption that whoever owns the data is responsible for its quality, but this is not always the most sensible approach. Businesses should adhere to three fundamental best practices to ensure that their data is complete and up-to-date.

Data Profiling: Assessing Your Most Valuable Asset

Data should be viewed as a corporate asset. It has measurable value integral to achieving strategic objectives and gaining a competitive edge. However, for data to really be an asset, it must be used while fresh. To be consistently usable, it needs to be complete and regularly refreshed.

First, profile your data. You wouldn’t buy a house without first having it inspected. A qualified inspector will look at the foundation and identify building flaws that could create a problem in the future. You want the same kind of information about your corporate data. Virtually all data quality profiling tools will provide counts on the percentages of fields that are populated, but for real insight, you need to be able to view key data values as well. For example, are there numbers or symbols in fields where only text is appropriate? How many of your unique identifiers (customer number, account number, etc.) are not unique? This information can help you identify outliers, anomalies and other questionable data points and direct you to your organization’s larger data quality issues.
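To make the profiling step concrete, here is a minimal sketch of such a pass, assuming records arrive as plain Python dicts (the field and identifier names are hypothetical, not from any particular tool). It reports fill rates per field, flags text fields containing digits, and counts "unique" identifiers that are actually duplicated:

```python
import re
from collections import Counter

def profile(records, key_field):
    """Profile a list of dict records: per-field fill rates, digit
    contamination in text fields, and duplicated key values."""
    fields = sorted({f for r in records for f in r})
    total = len(records)
    report = {}
    for f in fields:
        values = [r.get(f) for r in records]
        filled = [v for v in values if v not in (None, "")]
        report[f] = {"fill_rate": len(filled) / total if total else 0.0}
        if f != key_field:
            # Numbers or symbols where only text is appropriate?
            report[f]["has_digits"] = sum(
                1 for v in filled if isinstance(v, str) and re.search(r"\d", v))
    # How many unique identifiers are not actually unique?
    keys = [r.get(key_field) for r in records if r.get(key_field)]
    report["duplicate_keys"] = {k: n for k, n in Counter(keys).items() if n > 1}
    return report

rows = [
    {"cust_id": "C001", "name": "Ann Lee", "city": "Boston"},
    {"cust_id": "C002", "name": "B0b 5mith", "city": ""},
    {"cust_id": "C001", "name": "Ann Lee", "city": "Boston"},  # repeated key
]
r = profile(rows, "cust_id")
```

Even this toy report surfaces the kinds of outliers the inspection metaphor describes: a partially populated city field, a name contaminated with digits, and a customer number that appears twice.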

Scrub Data Regularly to Ensure Accuracy

There are four key steps to data cleansing. Taken together, they will ensure that your data provides the best foundation for analysis and informed decision-making across the enterprise.

  1. Format fields. You want consistent terms and formats across a given field. If you are relying on human beings to input the data, this will not happen automatically.
  2. Parse components. Break down strings of data into multiple fields so you can more effectively standardize data elements with greater accuracy.
  3. Check content. Some records include accurate information that is embedded in the wrong fields. Other fields may appear populated but are not accurate (for example, a phone number field that looks like: “111-111-1111”). Your data cleansing process should identify and correct these anomalies so your data is fit for use.
  4. Eliminate duplicates. Identify matches and eliminate duplicate records. Once your data is standardized, this can be done with a high degree of confidence.
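The four steps above can be sketched as one small pipeline, assuming records are Python dicts with hypothetical name and phone fields (the placeholder-phone list is illustrative, not exhaustive):

```python
import re

# Step 3 assumption: known placeholder values that only look populated.
BOGUS_PHONES = {"111-111-1111", "000-000-0000"}

def cleanse(records):
    """Apply the four steps: format, parse, check content, de-duplicate."""
    seen, out = set(), []
    for r in records:
        # 1. Format fields: trim whitespace, normalize case and phone format.
        name = " ".join(r.get("name", "").split()).title()
        digits = re.sub(r"\D", "", r.get("phone", ""))
        phone = f"{digits[:3]}-{digits[3:6]}-{digits[6:]}" if len(digits) == 10 else ""
        # 2. Parse components: break the name string into separate fields.
        first, _, last = name.partition(" ")
        # 3. Check content: blank out placeholder phone numbers.
        if phone in BOGUS_PHONES:
            phone = ""
        # 4. Eliminate duplicates, matching on the standardized record.
        key = (first.lower(), last.lower(), phone)
        if key in seen:
            continue
        seen.add(key)
        out.append({"first": first, "last": last, "phone": phone})
    return out

clean = cleanse([
    {"name": "  ann   LEE ", "phone": "(555) 010-2000"},
    {"name": "Ann Lee", "phone": "5550102000"},      # duplicate after formatting
    {"name": "Bob Smith", "phone": "111-111-1111"},  # placeholder phone
])
```

Note the ordering: duplicate elimination runs last, because only after formatting and parsing do the two variants of "Ann Lee" become recognizably the same record.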

It cannot be emphasized enough that data cleansing is not a one-time operation. Even the best data gets stale. Also, because customers have unprecedented access to their data (online, over the phone, through the mail, in the store or branch, etc.), there are multiple opportunities for changes and mistakes to enter your system.

An ongoing program that includes both batch and real-time maintenance is the best defense against poor data quality. In batch, run your data through the data cleansing steps regularly and you will be able to correct issues as they arise. Annual data cleansing is a minimal effort. A best practice is to perform this task at least quarterly. Pair batch with real-time data quality applications that validate data as it is entered, serving as a data quality firewall.

Both processes have advantages that complement one another. Managing data quality at the point of entry requires speed and reliability on a transactional basis; batch processes allow for more thorough and complete cleansing.
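A point-of-entry "firewall" can be as simple as a validation function that rejects a record before it reaches the database. A minimal sketch, assuming dict records and hypothetical field names and rules (real deployments would use far richer rules, e.g. address verification services):

```python
import re

# Assumed per-field format rules; patterns here are deliberately simple.
RULES = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "zip":   re.compile(r"^\d{5}(-\d{4})?$"),
}

def validate(record, required=("cust_id", "email")):
    """Return a list of problems; an empty list means the record may enter."""
    errors = [f"missing {f}" for f in required if not record.get(f)]
    for field, pattern in RULES.items():
        value = record.get(field)
        if value and not pattern.match(value):
            errors.append(f"bad {field}: {value!r}")
    return errors

ok = validate({"cust_id": "C001", "email": "ann@example.com", "zip": "02101"})
bad = validate({"cust_id": "", "email": "not-an-email"})
```

Because a check like this must run on every transaction, it is kept fast and narrow; the heavier standardization and matching work is left to the periodic batch process.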

Take Ownership of Data Quality

Data quality often falls by the wayside because it's not clear who owns the data. For most companies, the responsibility lies with the data governance committee. The commitment to sound data quality and security practices must begin at the top of the organization and include stakeholders at every level; in practice, most data governance committees represent exactly this mix.

Data governance is an ongoing commitment. As an organization’s needs change, its data governance policies must be reviewed to ensure alignment. On a day-to-day basis, many organizations are recognizing and embracing the evolving role of the data steward. This role began in the IT arena, but trends indicate that it is branching out as the level of accountability entrusted to the stewards increases. It is largely the data steward (also a member of the data governance team) who will determine the business rules for an organization’s data quality.

Ensuring your organization’s data quality is an ongoing process. After assessing data quality and taking steps to fix incomplete, inaccurate or duplicate records, organizations need to create data governance programs. Without a plan and routine maintenance, companies may find themselves dealing with bad data on a continual basis, which can derail the success of any business initiative. By ensuring that your organization has a solid foundation, you can be confident that all your efforts – whether they’re targeted marketing, managing inventory or even bill production – are based on accurate and complete data.
