Companies that need to improve enterprise data quality are looking to data governance programs to help achieve their goals. In the past, data governance emphasized a time- and labor-intensive, top-down approach that did not yield immediate results. Today, agile practices have evolved that incorporate streamlined processes and technologies, making data governance an effective way to solve many pressing data quality problems. In a prior article, "Agile Data Governance: the Key to Solving Enterprise Data Quality Problems," we discussed why a top-down approach doesn't work and gave an overview of how to implement an agile approach to data governance. This article describes the seven steps companies should follow when embarking on an agile data governance project.
Before the seven-step process begins, companies need to assemble a small, core data governance board made up of executives who can authoritatively represent the entire organization's business goals. A good place to start when selecting board members is with existing audit and compliance groups or other organizations within the company responsible for upholding regulatory guidelines. If these formal organizations don't exist, then companies should identify senior people within the organization who understand the value of data and will realize direct benefits from helping to fix the data problems. These people are often executives who run business units, manage manufacturing and distribution, or act as chief marketing or privacy officers. Once the board has been assembled, it needs to identify the biggest data problems the enterprise faces and decide which problem to fix first. When the first project has been determined, the organization is ready to embark on the seven steps it needs to follow to achieve success.
Step One: Selecting the Project Implementation Team
The data governance board should assemble the project implementation team, which will consist of five to eight members. The implementation team should be made up of managers who will benefit from fixing the specific type of data being addressed. The best way for the board to identify these individuals is to ask the IT organization for a list of systems involved in the project and the business owners of the data in each system. These data owners or a subset of this group should make up the implementation team, which will work together to ensure that the project gets completed in a timely fashion. For example, if customer data were being resolved, the team would be made up of the data owners of systems involved in the customer data lifecycle.
Step Two: Defining the Size and Scope of the Data Problem
The data governance project implementation team first needs to clearly define the business problems it is trying to solve by improving data quality. For example, if the problem is inconsistent customer data across business silos, the mission may be to ensure that all customer data is accurate and coordinated in all divisions of the company. Success would mean that the sales order-entry people have access to the same information as the customer support people and the accounting people. Specifically, the goal could be that anyone in the company who looks up a customer sees the same customer information whenever they need it. To achieve this goal, the systems that would need to be interrogated from a data quality perspective might include finance and accounting, customer relationship management (CRM), fulfillment, the data warehouse and any ecommerce, supply chain or electronic data interchange (EDI) systems that may be in use.
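One way to make the "same customer information everywhere" goal measurable is to compare each customer's record across the systems in scope. The following is a minimal sketch, not a prescribed method: the system names, customer IDs and fields are hypothetical, and small in-memory dictionaries stand in for real extracts.

```python
from collections import defaultdict

# Hypothetical extracts keyed by customer ID. The system names and fields
# are illustrative assumptions, not taken from any real schema.
SYSTEMS = {
    "crm": {
        "C001": {"name": "Acme Corp", "postal_code": "02139"},
        "C002": {"name": "Globex", "postal_code": "60601"},
    },
    "finance": {
        "C001": {"name": "Acme Corp", "postal_code": "02139"},
        "C002": {"name": "Globex Inc.", "postal_code": "60601"},
    },
    "fulfillment": {
        "C001": {"name": "Acme Corp", "postal_code": "02140"},
    },
}


def find_inconsistencies(systems):
    """Return {(customer_id, field): {value: [system names]}} for every
    field whose value differs between at least two systems."""
    seen = defaultdict(lambda: defaultdict(list))
    for system_name, records in systems.items():
        for customer_id, record in records.items():
            for field, value in record.items():
                seen[(customer_id, field)][value].append(system_name)
    # Keep only fields with more than one distinct value across systems.
    return {key: dict(values) for key, values in seen.items() if len(values) > 1}


if __name__ == "__main__":
    for (customer_id, field), values in sorted(find_inconsistencies(SYSTEMS).items()):
        print(customer_id, field, values)
```

The count of inconsistent fields gives the implementation team a rough baseline for the size of the problem before any fixes are made.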
Step Three: Drafting the Data Steward Team
Once the project implementation team members have defined the business problems they are attempting to solve, the systems that process the data and the timing of the project, they need to draft a data steward team. The best people for the team are those with the most knowledge of the data being addressed and who are also capable of overseeing the investigation and remediation of data problems. Data steward teams are usually made up of people within each division who work directly with one or more source systems and are intimately familiar with each system's data problems. For a customer data project, for example, the systems would be those that capture customer data, interact directly with customers, integrate data from external sources, place orders from sales reps or capture information used by CRM analysts.
Step Four: Validating or Disproving Project Assumptions
The first task for the data steward team is to look at the candidate set of systems identified in step two and dig down into those systems to determine exactly where data quality problems occur. The team should look at data feeds, message content, files that are passed between systems, and extract, transform and load (ETL) jobs. The analysis will identify where most, if not all, of the balls are being dropped, or where programmers have made incorrect logic assumptions, such as assuming that an address coming from a particular system is always a billing address.
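An assumption like the billing-address one can be tested directly against a sample of the feed. The sketch below is illustrative only: the record layout and the field names `source` and `address_type` are assumptions, and a real feed would be read from files or messages rather than a hard-coded list.

```python
# Hypothetical feed records. The field names ("source", "address_type")
# are assumptions made for illustration.
FEED = [
    {"source": "order_entry", "address_type": "billing"},
    {"source": "order_entry", "address_type": "shipping"},
    {"source": "order_entry", "address_type": "billing"},
    {"source": "web_store", "address_type": "shipping"},
]


def check_assumption(records, source, expected_type):
    """Return (violations, total) for the assumption that every address
    arriving from `source` has the address type `expected_type`."""
    from_source = [r for r in records if r["source"] == source]
    violations = [r for r in from_source if r["address_type"] != expected_type]
    return len(violations), len(from_source)


if __name__ == "__main__":
    bad, total = check_assumption(FEED, "order_entry", "billing")
    print(f"{bad} of {total} order_entry addresses are not billing addresses")
```

A single violation is enough to disprove an "always" assumption. The ratio indicates how much downstream data may already be affected.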
Once the team's evaluation is complete, the data stewards should provide the implementation team with an estimate of how long it would take to fix all the problems. This estimate will almost always be longer than the timeline for completing the project allows. Because every problem can't be fixed while meeting the timeline, the implementation team will usually ask the data stewards to recommend which fixes will resolve the most problems within the time available.
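The trade-off between the number of problems solved and the time available can be roughed out with a simple ranking. The sketch below uses a greedy heuristic (best issues-per-week first), which is an assumption of this example rather than a method the article prescribes. It is not guaranteed to find the optimal selection, and the fix names and estimates are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fix:
    name: str
    effort_weeks: float   # steward estimate; assumed to be greater than zero
    issues_resolved: int  # estimated number of known issues the fix removes


def pick_fixes(candidates, budget_weeks):
    """Greedy heuristic: take fixes with the best issues-per-week ratio
    first, skipping any that no longer fit in the remaining budget."""
    ranked = sorted(
        candidates,
        key=lambda fix: fix.issues_resolved / fix.effort_weeks,
        reverse=True,
    )
    chosen, remaining = [], budget_weeks
    for fix in ranked:
        if fix.effort_weeks <= remaining:
            chosen.append(fix)
            remaining -= fix.effort_weeks
    return chosen


# Invented estimates, for illustration only.
CANDIDATES = [
    Fix("standardize address types", 6, 120),
    Fix("deduplicate customer names", 10, 150),
    Fix("repair ETL date parsing", 4, 40),
    Fix("rebuild partner feed mapping", 12, 90),
]

if __name__ == "__main__":
    for fix in pick_fixes(CANDIDATES, budget_weeks=20):
        print(fix.name)
```

A ranking like this is only a starting point for discussion. The stewards' judgment about dependencies between fixes and business impact still decides the final recommendation.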
At this point, the team usually has one of two options for approaching the project. The first would be to solve data quality problems for a single kind of business transaction, such as order management. For this option, the single domain of data being focused upon would be fixed in all systems that participate in that business transaction. This is usually a fairly straightforward way to derive business ROI. The other approach would be to pick one kind of data to fix, such as customer name and address, and touch all of the systems to fix just that one specific kind of data. Data steward teams should evaluate both options, determine which will get closest to achieving the project's business goals within the allotted time frame and make recommendations to the implementation team on the best approach for the project.
Step Five: Establishing New Policies for Handling Data
To help eliminate or prevent data problems within and between systems, the data steward team must identify policies that need to be created or modified. Then they need to determine how these policy changes will impact the organization and its systems and conduct bench tests of the policy changes. The implementation team should be responsible for securing policy change approvals from the data governance board.
The data steward team needs to ensure it is defining policies for system-independent data and transaction-specific data quality. They must also address and solve often-contentious cross-organizational data sharing issues. When participating systems have unique data integrity constraints that do not represent shared business rules, data stewards should make sure that system-specific constraints do not leak into shared business policies and negatively impact other systems.
Even when an organization takes on a straight-line data lineage problem, such as one that runs from the moment a customer shows up on a Web site to order something, through the steps where they're billed, shipped, confirmed and supported, to the point where their data is analyzed, that process is going to cross organizational teams and most likely organizational boundaries. Step five is all about the politics of ensuring that any executive who must compromise to facilitate these changes has approved the processes.
Step Six: Enlisting Internal IT Groups and Kicking Off Implementation
After the list of new business rules has been created, the data governance and steward teams will call on enterprise architects from internal IT groups to help them design the best solution for the project, before identifying specific problems and going about making fixes. It is possible that not every new business rule can be implemented in the time allotted to complete the project. Both teams will rely on the enterprise architects to help them determine which rules are most essential to fixing the business problems and how to minimize project scope and prioritize implementation to meet the timeline.
Beware that because data governance projects are usually not part of a programmer's day-to-day job responsibilities, specific repairs are often made without the requisite rigor. This quick-and-dirty approach can introduce additional data quality problems now or later, which is why it is important that an architect be involved in overseeing data quality fixes and creating designs for how business rules should be implemented. It is also critical that the architect include metrics reporting as part of the project so that exceptions and error rates are captured and reported as data quality errors occur.
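The metrics reporting described above can start as a per-rule error rate computed over each batch of records. This is a minimal sketch under stated assumptions: the two rules, their names and the record fields are hypothetical, and a real project would send the results to whatever reporting tool it already uses.

```python
from collections import Counter


def _is_five_digit_postal_code(record):
    code = record.get("postal_code", "")
    return code.isdigit() and len(code) == 5


# Hypothetical business rules: each maps a rule name to a predicate that
# returns True when a record passes the rule.
RULES = {
    "name_present": lambda record: bool(record.get("name", "").strip()),
    "postal_code_5_digits": _is_five_digit_postal_code,
}


def error_rates(records, rules):
    """Return {rule_name: fraction of records that fail the rule}."""
    failures = Counter()
    for record in records:
        for rule_name, passes in rules.items():
            if not passes(record):
                failures[rule_name] += 1
    total = len(records)
    return {name: (failures[name] / total if total else 0.0) for name in rules}


# Invented sample batch, for illustration only.
BATCH = [
    {"name": "Acme Corp", "postal_code": "02139"},
    {"name": "", "postal_code": "60601"},
    {"name": "Globex", "postal_code": "6060"},
    {"name": "Initech", "postal_code": "73301"},
]

if __name__ == "__main__":
    for rule_name, rate in error_rates(BATCH, RULES).items():
        print(f"{rule_name}: {rate:.0%} of records failed")
```

Tracking these rates per batch over time shows whether a fix actually reduced errors or introduced new ones.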
Step Seven: Comparing Results, Evaluating Processes and Determining Next Steps
Once the project is completed, the last step is to look at the results of the changes and evaluate whether you did the right thing. There are many different ways to measure results and success, depending on the type of project and the data and systems involved.
I often hear organizations saying something like: "Well, we took care of 18 of the 52 business rules to solve our customer data problem, and the process was successful, so now let's begin another project to go back and tackle the next set." Normally companies will iterate through one type of data until they have completed as much data quality resolution as makes sense, based on the ROI of these projects. However, sometimes tackling another type of data makes more sense for the company. For example, they might say, "With this first customer data project, we solved more than 50 percent of our problems. Now we are going to do the same thing with our product data. When we have completed the first product data project, then we will determine whether to fix the remaining customer or product data issues or focus on another new data problem."
Creating Agile Data Governance is a Reality
Many companies today are successfully completing agile data governance projects. The key to their success is not trying to boil the ocean. Instead, choose a project that will solve one of the organization's most pressing data problems, and manage that project to completion in less than a year. Enterprise-wide data problems are not simple problems that can be solved as one large project. Expect to solve a set of problems with one project, then go back and evaluate, revise and repeat the process until your data quality achieves a level that makes the most business sense.