Companies attempting to implement data governance programs must contend with the fact that individual data stewards work within different business silos or system groups. Often, their efforts are not as coordinated as they could be, and sometimes they don't even know who the other data stewards are that they should be communicating with. In two previous articles, Agile Data Governance: The Key to Solving Enterprise Data Quality Problems and Seven Steps to Agile Data Governance, we discussed processes for streamlining data governance. This article is about how organizations that are implementing data governance programs can use new Web 2.0 and other collaboration tools to significantly move the productivity dial by improving communications across geographic, project, business, organizational, system, language and time zone boundaries.
Data steward teams already utilize various technologies to help them identify data patterns, anomalies and other data quality issues. In addition to data tools, data stewards also need tools to help them coordinate efforts, communicate more effectively and achieve better results. These Web 2.0 and collaboration tools, which tend to be readily available, easy to implement and simple to use, make the data steward's job easier while contributing substantive data quality improvements. They're also no big secret. Many IT professionals are already utilizing technologies such as wikis, tags, blogs and mashups to communicate and collaborate. They just aren't using them in a coordinated way to address data governance issues, a situation that can be easily changed.
When used strategically, Web 2.0 and other collaboration technologies easily and cost-effectively streamline the data governance process.
Wikis
Wikis are open, Web-based forums that help participants communicate and collaborate more effectively. Wikis are the perfect place for companies to document information about data governance projects as well as other cross-functional or cross-organizational projects. Data governance teams can easily leverage wikis to streamline communications across the enterprise. A data governance wiki is an ideal place for data stewards to have an open online dialogue that can be captured in a collaborative fashion.
A data governance wiki is the best forum for reviewing and refining data policies as well as documenting the evolution of these policies throughout a project. Often, policies become static and don't have the impact they should have because, once finalized, they become documents that are simply filed away somewhere and rarely used. Policies that are managed in wikis can be actively used, reviewed and refreshed by critical members of the data governance team. Wikis can also be the repository for documenting data quality problems and tying those problems to the policies governing the data (see tagging below).
In addition, wikis can be invaluable for managing the evolution and dissemination of a company's business rules as well as documenting decision-making rights for specific data. They provide a central location where data stewards and other IT people from around the company can view rules and evaluate whether they are upholding them. Wikis can also be used to define roles and responsibilities and to list the members of decision-making groups, such as the data governance board or an implementation team. Information about board or team members can be posted with links to their photos and profiles stored in the corporate directory. Meeting minutes, project plans and timelines can also be included here.
Tagging
Also known as folksonomies and social classification, collaborative tagging is a great way for people to create their own annotations and descriptors about a given subject. Popular examples of sites that use tags include Digg, del.icio.us, Diigo and Ma.gnolia. These sites all enable users to create their own taxonomy of terms and tag content (and bookmarks) with these terms, which can then be used to find other content that shares the same tags.
Tagging creates a semantically rich environment where people can define problems from the bottom up using their own words instead of relying on a top-down, set list of predefined terms. This allows the nuances of a data problem to be captured by applying many different tags to it. Because tagging is a user-driven approach to organizing content, one data steward might tag a data quality problem as a message fault, another may see it as a data quality issue and another may identify it as a data validation problem. If someone is attempting to fix a problem and they perform an analysis on the systems involved, they could discover correlations through tag analysis that point to a common cause. For example, they might discover that every time a data validation error is tagged, there is also a tag issued for a message fault and for data quality. This discovery could result in a dialogue between the three analysts that would help to better resolve the issue.
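The kind of tag analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the issue IDs, the sample tags and the function names (tag_cooccurrence, always_accompanied_by) are hypothetical and do not come from any particular tagging tool. The sketch counts how often pairs of tags appear on the same issue and reports which tags are present every time a given tag is used. Note that co-occurrence only suggests where to look; it does not by itself prove a cause.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each issue ID maps to the set of tags
# that different data stewards applied to that issue.
issue_tags = {
    "ISSUE-1": {"data validation", "message fault", "data quality"},
    "ISSUE-2": {"data validation", "message fault", "data quality"},
    "ISSUE-3": {"data quality"},
    "ISSUE-4": {"message fault", "data quality"},
}


def tag_cooccurrence(issues):
    """Count how often each tag, and each pair of tags, appears on an issue."""
    tag_counts = Counter()
    pair_counts = Counter()
    for tags in issues.values():
        tag_counts.update(tags)
        # Sort so that each pair is always stored in the same order.
        pair_counts.update(combinations(sorted(tags), 2))
    return tag_counts, pair_counts


def always_accompanied_by(issues, tag):
    """Return the tags that appear on every issue carrying the given tag."""
    tag_counts, pair_counts = tag_cooccurrence(issues)
    total = tag_counts[tag]
    if total == 0:
        return []
    companions = []
    for (first, second), count in pair_counts.items():
        # A companion tag co-occurs as many times as the tag itself occurs.
        if tag in (first, second) and count == total:
            companions.append(second if first == tag else first)
    return sorted(companions)


if __name__ == "__main__":
    for tag in ("data validation", "message fault", "data quality"):
        print(tag, "->", always_accompanied_by(issue_tags, tag))
```

With this sample data, every issue tagged "data validation" also carries "message fault" and "data quality", which is the signal that would prompt the three analysts to compare notes.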









