Companies implementing data governance programs must contend with the fact that individual data stewards work within different business silos or system groups. Often, their efforts are not as coordinated as they could be, and sometimes they don’t even know who the other data stewards are that they should be communicating with. In two previous articles, “Agile Data Governance: The Key to Solving Enterprise Data Quality Problems,” and “Seven Steps to Agile Data Governance,” we discussed processes for streamlining data governance. This article is about how organizations that are implementing data governance programs can use new Web 2.0 and other collaboration tools to significantly move the productivity dial by improving communications across geographic, project, business, organizational, system, language and time zone boundaries.


Data steward teams already utilize various technologies to help them identify data patterns, anomalies and other data quality issues. In addition to data tools, data stewards also need tools to help them coordinate efforts, communicate more effectively and achieve better results. These Web 2.0 and collaboration tools, which tend to be readily available, easy to implement and simple to use, make the data steward’s job easier while contributing substantive data quality improvements. They’re also no big secret. Many IT professionals are already utilizing technologies such as wikis, tags, blogs and mashups to communicate and collaborate. They just aren’t using them in a coordinated way to address data governance issues, which can be easily changed.


When used strategically, Web 2.0 and other collaboration technologies easily and cost-effectively streamline the data governance process.




Wikis are open, Web-based forums that help participants communicate and collaborate more effectively. Wikis are the perfect place for companies to document information about data governance projects as well as other cross-functional or cross-organizational projects. Data governance teams can easily leverage wikis to streamline communications across the enterprise. A data governance wiki is an ideal place for data stewards to have an open online dialogue that can be captured in a collaborative fashion.


A data governance wiki is the best forum for reviewing and refining data policies as well as documenting the evolution of these policies throughout a project. Often, policies become static and don’t have the impact they should have because, once finalized, they become documents that are simply filed away somewhere and rarely used. Policies that are managed in wikis can be actively used, reviewed and refreshed by critical members of the data governance team. Wikis can also be the repository for documenting data quality problems and tying those problems to the policies governing the data (see tagging below).


In addition, wikis can be invaluable for managing the evolution and dissemination of a company’s business rules as well as documenting decision-making rights for specific data. They provide a central location where data stewards and other IT people from around the company can view rules and evaluate whether they are upholding them. Wikis can also be used to define roles and responsibilities and members of decision-making groups, such as the data governance board or an implementation team. Information about board or team members can be posted with links to photos and information about members that is stored in the corporate directory. Meeting minutes, project plans and timelines can also be included here.




Also known as folksonomies and social classification, collaborative tagging is a great way for people to create their own annotations and descriptors about a given subject. Popular examples of sites that use tags include Digg, Diigo and Ma.gnolia. These sites all enable users to create their own taxonomy of terms and tag content (and bookmarks) with these terms, which can then be used to find other content that shares the same tags.


Tagging creates a semantically rich environment where people can define problems from the bottom up using their own words instead of relying on a top-down, set list of predefined terms. This allows the nuances of a data problem to be defined by creating many different tags for a problem. Because tagging is a user-driven approach to organizing content, one data steward might tag a data quality problem as a message fault, another may see it as a data quality issue and another may identify it as a data validation problem. If someone is attempting to fix a problem and they perform an analysis on the systems involved, they could discover causal correlations through tag analysis. For example, they might discover that every time a data validation error is tagged, there is also a tag issued for a message fault and for data quality. This discovery could result in a dialogue between the three analysts that would help to better resolve the issue.
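The tag analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the issue IDs, the tag names and the helper functions `tag_cooccurrence` and `always_together` are hypothetical, and a real wiki or tagging tool would supply the tag data through its own export or API.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each issue ID maps to the tags stewards applied.
issue_tags = {
    "ISSUE-1": {"data-validation", "message-fault", "data-quality"},
    "ISSUE-2": {"data-validation", "message-fault", "data-quality"},
    "ISSUE-3": {"message-fault"},
    "ISSUE-4": {"data-quality", "duplicate-record"},
}


def tag_cooccurrence(tagged):
    """Count how often each pair of tags appears on the same issue."""
    pairs = Counter()
    for tags in tagged.values():
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(tags), 2):
            pairs[(a, b)] += 1
    return pairs


def always_together(tagged, tag):
    """Return the tags present on every issue that carries `tag`."""
    matching = [tags for tags in tagged.values() if tag in tags]
    if not matching:
        return set()
    return set.intersection(*matching) - {tag}


pairs = tag_cooccurrence(issue_tags)
companions = always_together(issue_tags, "data-validation")
```

In this sample, `companions` contains the message fault and data quality tags, which mirrors the discovery described above. Co-occurrence of this kind is a prompt for the analysts to talk to each other, not proof that one problem causes another.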


Tagging can be used to help with data governance issues. Tags can be added to wiki discussions to track system and data quality issues that need attention, and participants can click on them to link to further discussions. Tags can also be used to document which systems are participating in a customer data governance project and identify the level of compliance and security of each system. “Tag clouds” can also be created from documents, URLs and other sources to make it easier to discover and analyze content, such as data governance policies.


User Ranking Systems


These types of evaluation tools are used by a variety of community and ecommerce sites, such as Netflix, which employs a five-star rating system that enables users to rank movies, and by retail sites that rank products according to a five-star system based on consumer reviews. The reason ranking systems have been so successful is that most people have opinions, but in most cases lack a method or location to express them. User ranking systems provide an easy, visual way to determine popularity, worth and value. Businesses can utilize them to help assess the success of data governance projects or the value, risk or importance of a data quality topic.


Why not allow data stewards to create a list of topics that people can view and rank on an internal Web site or a wiki? These rankings could give companies a way to measure the success or failure of a given project or determine the priority for new initiatives. Rankings could be used to assess the quality of data repairs, fixes and remediations. In addition, they could be used to document observations about throughput, reliability, availability, security, trustworthiness and accuracy of any system, interface, service, message, adaptor/connector or report.
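As a rough sketch of how such rankings might be tallied, the Python below averages five-star ratings per topic and orders the topics from highest to lowest. The topics, steward names and the `rank_topics` function are invented for illustration; an actual wiki or intranet ranking widget would store and expose its ratings in its own way.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ratings: (topic, steward, stars on a 1-5 scale).
ratings = [
    ("Customer address cleanup", "alice", 5),
    ("Customer address cleanup", "bob", 4),
    ("Duplicate supplier records", "alice", 2),
    ("Duplicate supplier records", "carol", 3),
    ("Duplicate supplier records", "bob", 1),
]


def rank_topics(rows):
    """Return (topic, average stars, vote count), best-rated first."""
    by_topic = defaultdict(list)
    for topic, _steward, stars in rows:
        if not 1 <= stars <= 5:
            raise ValueError(f"stars must be between 1 and 5, got {stars}")
        by_topic[topic].append(stars)
    summary = [(topic, mean(votes), len(votes)) for topic, votes in by_topic.items()]
    return sorted(summary, key=lambda row: row[1], reverse=True)


ranking = rank_topics(ratings)
```

Keeping the vote count next to the average is a deliberate choice: a topic rated by two people should not carry the same weight in a prioritization meeting as one rated by twenty.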


Ranking systems could also be opened up to customers and partners, which would allow companies to start aligning projects or technologies directly with business value to determine ROI. Giving customers and partners a method to rank services also provides a simple way to determine customer satisfaction and answer usability questions.




By now, practically everyone knows what a blog is. But how can blogs be used to help improve data quality? They can educate and inform employees, or groups can use them to debate unresolved issues or to continue discussions between meetings. For educational purposes, for example, a blog could provide information about why one project was chosen over another. Blogs could also provide updates on projects and actively request reader feedback.


Data governance boards could assign different data stewards to blog each week about the problems they are trying to solve and the projects they are working on. Over time, this type of blog would help inform data stewards, data governance constituents and other readers about how the company is working to solve global data quality issues.


RSS Feeds


RSS feeds are a great way to push information to people. Whether it is information about new training and educational materials, updates to project milestones in a wiki, final results of a user ranking survey or a weekly podcast highlighting a unique data quality issue, RSS feeds help streamline and improve the efficiency of information distribution.
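For teams that want to publish their own feed, a minimal RSS 2.0 document can be produced with Python's standard library. The sketch below is illustrative only: the `build_feed` function, the feed title and the example.com URLs are assumptions, and most wiki and blog platforms generate feeds automatically.

```python
import xml.etree.ElementTree as ET


def build_feed(title, link, description, items):
    """Build a minimal RSS 2.0 feed and return it as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires a title, link and description on the channel.
    for tag, text in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = text
    for entry in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["link"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical feed contents; the URLs are placeholders.
feed_xml = build_feed(
    title="Data Governance Updates",
    link="https://example.com/governance",
    description="Project milestones and data quality news",
    items=[
        {"title": "Address policy revised", "link": "https://example.com/governance/1"},
        {"title": "Ranking survey results", "link": "https://example.com/governance/2"},
    ],
)
```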




A mashup is a type of Web application that combines data from multiple sources into a single, rich integrated application so that people can get the information they need much faster than they could before. Using mashups, corporations can quickly provide data quality, data validation or MDM services within a single Web interface without the time and expense of IT integration projects or the need to purchase new systems.


Wouldn’t it be great if data governance team members could use mashups to show real data in relation to a data policy wiki? For example, if there were a policy that governed postal addresses for customers, a dashboard could be invoked from a wiki that shows the number of address exceptions captured, new addresses added to the system and sources of policy violations. That same mashup could pull up the metadata specification and policy for postal addresses and compare them to actual data to ensure compliance. Another example is a mashup used in conjunction with a policy governing the definition of a customer. In this case, the mashup could invoke a service that runs a report within the data policy wiki that shows the number of new customers within a given period. Mashups could also be used to populate a dashboard displaying key performance indicators such as number of orders tracked, new customers and number of postal address acceptances or rejections.
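The postal address example can be approximated in code by combining two sources: a policy specification, as it might be recorded in the wiki, and customer records drawn from operational systems. Everything in the sketch below is hypothetical, including the policy fields, the assumed US ZIP code pattern, the source system names and the `violations` and `dashboard` functions.

```python
import re
from collections import Counter

# Hypothetical policy, as it might be recorded in the data policy wiki.
address_policy = {
    "required_fields": ("street", "city", "postal_code"),
    "postal_code_pattern": r"\d{5}(-\d{4})?",  # assumed US ZIP format
}

# Hypothetical customer records pulled from two source systems.
records = [
    {"source": "crm", "street": "1 Main St", "city": "Austin", "postal_code": "78701"},
    {"source": "crm", "street": "", "city": "Austin", "postal_code": "78701"},
    {"source": "billing", "street": "9 Elm Ave", "city": "Boston", "postal_code": "0211"},
    {"source": "billing", "street": "4 Oak Rd", "city": "Denver", "postal_code": "80202-1234"},
]


def violations(record, policy):
    """List the ways a single record breaks the address policy."""
    problems = [f"missing {name}" for name in policy["required_fields"] if not record.get(name)]
    code = record.get("postal_code", "")
    if code and not re.fullmatch(policy["postal_code_pattern"], code):
        problems.append("bad postal_code")
    return problems


def dashboard(rows, policy):
    """Summarize acceptances and rejections, with rejections grouped by source."""
    rejected_by_source = Counter()
    accepted = 0
    for row in rows:
        if violations(row, policy):
            rejected_by_source[row["source"]] += 1
        else:
            accepted += 1
    return {
        "accepted": accepted,
        "rejected": sum(rejected_by_source.values()),
        "rejected_by_source": dict(rejected_by_source),
    }


summary = dashboard(records, address_policy)
```

Because the policy is held as data and not hard-coded into the checks, a steward who updates the policy in the wiki would change what the dashboard reports without anyone rewriting the validation logic.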




Workflow technologies predate Web 2.0 but are still powerful collaboration tools and should actively be used to inject huge efficiency improvements into the data governance process. With workflow tools, corporations can assign a problem, such as resolving proper address details about a specific customer contained in multiple records, and track that issue through resolution. Workflow technologies ensure that data quality issues are managed consistently and completely from beginning to end.


With workflow technologies, relevant IT or data personnel might receive an email message with a URL linked to a Web page containing an explanation of a data quality problem that they need to resolve. Or, they might receive a text message, voicemail or instant message that alerts them to an issue that they need to handle. The data quality system can be configured to periodically send an employee a message detailing a new problem in the workflow. Or the system could be set up as a relay mechanism so that as each problem arises, the employee originally given the task of resolving a conflict could take action or tag the page and explain why that issue would be better solved by someone else. The employee could then pass a new message with the link to the tagged page to the next person in the data governance chain. Throughout the process, the workflow system would track the problem as it proceeds to resolution, send reports and alerts to designated personnel, and record time to completion, escalations and exceptions along the way.
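The relay mechanism and tracking described above amount to a small state model: an issue has an owner, can be handed to someone else with a stated reason, and is eventually resolved. The Python sketch below shows one way to represent that. The `Issue` class, its field names, the people and the timestamps are all hypothetical, and a commercial workflow product would add notifications, escalation rules and persistence.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Issue:
    """A data quality issue tracked from assignment through resolution."""
    summary: str
    assignee: str
    opened_at: datetime
    status: str = "open"
    history: list = field(default_factory=list)
    resolved_at: Optional[datetime] = None

    def reassign(self, new_assignee, reason, when):
        """Hand the issue to someone else and record why."""
        if self.status != "open":
            raise ValueError("only open issues can be reassigned")
        self.history.append((when, self.assignee, new_assignee, reason))
        self.assignee = new_assignee

    def resolve(self, when):
        """Mark the issue resolved at the given time."""
        if self.status != "open":
            raise ValueError("issue is already resolved")
        self.status = "resolved"
        self.resolved_at = when

    def time_to_completion(self):
        """Return elapsed time from opening to resolution, or None if still open."""
        if self.resolved_at is None:
            return None
        return self.resolved_at - self.opened_at


# Hypothetical walk-through of one issue.
start = datetime(2024, 1, 8, 9, 0)
issue = Issue("Conflicting addresses for one customer", "alice", start)
issue.reassign("bob", "address is owned by the billing team", start + timedelta(hours=2))
issue.resolve(start + timedelta(hours=5))
```

The `history` list is what makes the hand-offs auditable: each entry records when the issue moved, from whom, to whom and why.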


Communication That Leads To Better Problem Solving


With the Web 2.0 and collaboration technologies described in this article, businesses can increase the success of their data governance initiatives while giving all participants a powerful voice in the process. Each of these methods can be used in a variety of ways to fit the corporate culture of individual companies. The trick is to coordinate their usage across all divisions, departments and geographical locations to ensure everyone contributes.
