You know those little stickers they put on new clothing to indicate that someone has inspected it? They increase your confidence in the item's consistency, quality, conformity, and so on. Wouldn't you like to have a similar mechanism for your data warehouse? Data stewardship is one way of increasing the quality of the data warehouse and your confidence in it.
The word "steward" comes from Old English and is popularly said to mean "keeper of the sty," a sty ward. Hopefully, your data warehouse is not a sty; to keep it from becoming one, consider the benefits of implementing data stewardship.
The Need for Data Stewardship
Corporations are demanding better and better sources of data. The explosive growth of data warehousing and sophistication of the access tools are proof that data is one of the most critical assets any company possesses. Data, in the form of information, must be delivered to decision-makers quickly, concisely and, more importantly, accurately and in an integrated format.
The data warehouse is an excellent mechanism for getting information into the hands of decision-makers. However, it is only as good as the data that goes into it. Problems occur when we attempt to acquire and deliver this information. A substantial effort must be made in defining, integrating, cleansing and synchronizing the data coming from the myriad of operational systems producing data throughout the corporation. Who should be responsible for this important task? The answer for a growing number of companies is a new business function called data stewardship.
What is Data Stewardship?
Data stewardship has as its main objective the management of the corporation's data assets in order to improve their reusability, accessibility and quality. It is the data stewards' responsibility to approve business naming standards, develop consistent data definitions, determine data aliases, develop standard calculations and derivations, document the business rules of the corporation, monitor the quality of the data in the data warehouse, define security requirements, and so forth. (See Table 1 for a list of the data integration issues handled by data stewards.)
|Table 1: Data Integration Issues|
|Data stewards are responsible for:|
|Business naming standards|
|Standard entity definitions|
|Standard attribute definitions|
|Business rules specification|
|Standard calculation and summarization definitions|
|Entity and attribute aliases|
|Data quality analyses|
|Sources of data for the data warehouse|
|Data security specification|
|Data retention criteria|
This new data about data, or meta data, created by data stewards can then be used by the corporation's knowledge workers in their everyday analyses to determine what comparisons should be made, which trends are significant and whether apples have indeed been compared to apples.
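To make the idea of steward-maintained meta data concrete, here is a minimal sketch of a business glossary in Python. Everything in it is an assumption made for illustration: the `GlossaryEntry` and `Glossary` classes, the `net_revenue` term, its definition and its aliases are hypothetical, not taken from any particular repository product. It shows how a standard name, a definition, a set of aliases and a standard derivation might be stored together so that two differently named measures can be checked against the same approved definition.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class GlossaryEntry:
    """One steward-approved business term (hypothetical structure)."""
    standard_name: str
    definition: str
    steward: str
    aliases: Set[str] = field(default_factory=set)
    derivation: Optional[str] = None  # standard calculation, if the term is derived


class Glossary:
    """Maps standard names and aliases to a single approved entry."""

    def __init__(self) -> None:
        self._entries: Dict[str, GlossaryEntry] = {}
        self._alias_index: Dict[str, str] = {}

    def register(self, entry: GlossaryEntry) -> None:
        key = entry.standard_name.lower()
        self._entries[key] = entry
        # Index the standard name and every alias, ignoring case.
        for name in {entry.standard_name, *entry.aliases}:
            self._alias_index[name.lower()] = key

    def resolve(self, name: str) -> GlossaryEntry:
        """Return the approved entry for a standard name or any of its aliases."""
        try:
            return self._entries[self._alias_index[name.lower()]]
        except KeyError:
            raise KeyError(f"no steward-approved definition for {name!r}") from None


# Illustrative entry only; the term and its wording are assumptions.
glossary = Glossary()
glossary.register(GlossaryEntry(
    standard_name="net_revenue",
    definition="Invoiced sales less returns and discounts.",
    steward="finance steward",
    aliases={"net sales", "revenue_net"},
    derivation="gross_sales - returns - discounts",
))


def same_measure(a: str, b: str) -> bool:
    """True when two names resolve to the same approved definition."""
    return glossary.resolve(a) is glossary.resolve(b)
```

With this in place, `same_measure("Net Sales", "revenue_net")` returns `True`, which is one way an analyst could confirm that two reports are comparing apples to apples. A real meta data repository would add versioning, approval history and security attributes.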
Just as the demand for a data warehouse with good data has grown, the need for a data stewardship function has likewise grown. More and more companies are recognizing the critical role this function serves in the overall quest for high-quality, available data. Such an integrated, corporate-wide view of the data provides the foundation for the shared data so critical in the data warehouse.
Qualities of a Data Steward
Data stewards are well respected by the end-user community because of their thorough understanding of how the business works. They have the confidence of both the IT and end-user communities that they are not creating meta data and business rules that are impossible to implement or counter to the corporation's culture.
Table 2 lists the skill sets for a data steward. These are divided into two sets: technical skills and interpersonal skills. For most organizations, interpersonal skills may actually be the more important of the two.
|Table 2: Skill Sets Needed for Data Stewards|
|Technical skill set:|
|Basic understanding of data modeling|
|Basic understanding of DBMSs|
|Strong understanding of the data warehouse|
|Interpersonal skill set:|
|Solid understanding of the business|
|Excellent communications skills|
|Well respected in the subject area|
|Well respected for knowledge of the overall corporation|
The technical skills may seem more clear-cut than the interpersonal ones. First, data stewards need some knowledge of the IT systems and DBMSs employed in the corporation, which keeps them grounded in the reality of what is technologically feasible. Second, data stewards should be able to understand both logical and physical data models: how entities relate to each other, what redundancy is and why normalization rules are important. They are not, however, usually responsible for creating these models; that task usually falls to the data administration group.
Interpersonal skills are sometimes overlooked when choosing a data steward, yet these skills are invaluable. Data stewards often find themselves facilitating an agreement between two differing factions. Data integration can be a highly charged issue affecting the very core of how a company will continue to do business. Because of this, the data steward must be able to reach a consensus wherever possible or, at least, a reasonable compromise. The data steward must also often perform the difficult role of organizational change agent, smoothing the way for the changes that will inevitably happen as data is integrated and the corporation evolves its business processes.
The Scope of a Data Steward
A typical corporate data stewardship function should have one data steward assigned to each major data subject area. These subject areas consist of the critical data entities or subjects such as customer, order, product, market segment, employee, organization and inventory. Usually, a corporation has about 12 to 15 major subject areas. As an example, one data steward would be responsible for the customer subject area and another would be assigned to the product subject area.
The data steward responsible for a subject area usually works with a select group of employees representing all aspects of the company for that subject area. This committee of peers is responsible for resolving integration issues concerning their subject area. The results of the committee's work are passed on to the data administration and database administration functions for implementation into the corporate data models, meta data repository and, ultimately, the data warehouse construct itself.
Just as there is a data architect in most data administration functions, there should be a "lead" data steward responsible for the work of the individual data stewards. The lead data steward has two main responsibilities. The first is to determine and control the domain of each data steward. These domains can become muddy and unclear, especially where subject areas intersect, and political battles can develop between the data stewards if their domains are not clearly established. The second is to ensure that difficult issues are resolved in a reasonable time. If resolution appears impossible and a deadlock has occurred, it is the lead data steward who presents the issue to the steering committee (high-level executives of the corporation) for resolution.
Finding a Good Data Steward
Data stewards generally come from either the end-user community or the IT department. Subject matter experts from within the end-user community make good data stewards. They are quite knowledgeable about specific parts of the corporation. However, they may need training in some of the technical aspects of data models and IT systems. In addition, they must be familiar with business areas other than their own and be known for that knowledge. Otherwise, they can be perceived as biased toward their own perspectives on the data and, therefore, not representative of the entire enterprise.
Data modelers from the IT data administration function may also make good data stewards. They understand the technical issues of data integration and usually acquire a great deal of exposure to the business community while modeling the business rules, data entities and attributes. In addition, they generally have good rapport with end users and database administrators alike. However, these modelers must have the respect of the end-user community and the authority to make decisions on its behalf. They are often perceived as not knowing enough about how the business functions and, therefore, are discounted or ignored by the business community.
Perhaps the easiest way to ease into a data stewardship function is to assign one of the critical end users on your first data warehouse project the role of maintaining the subject area they know best. As each successive data warehouse project is completed, another data steward is added to the growing list. For example, if the first project dealt heavily with customer profiling and demographics, the end user most concerned with that area could become the informal steward of the customer subject area.
The second project, dealing with sales channel analysis, produces the second data steward, who is responsible for the sales channel subject area. The customer data steward must approve any changes to the customer subject area that result from the second project's analyses. The process continues with each new project, slowly filling out the other subject areas.
Eventually, this "informal data stewardship" process should become a formal function within the corporation. Through this informal process, data stewardship can gain significant visibility and perceived value, making the conversion to the formal process much easier and more acceptable.
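The approval rule described above, where a change to a subject area must be signed off by that area's steward regardless of which project proposes it, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ChangeRequest` class, the `STEWARDS` table and the steward names are hypothetical, and a real organization would track such requests in its meta data repository or workflow tool.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class ChangeRequest:
    """A proposed change to a subject area (hypothetical structure)."""
    subject_area: str   # area whose definitions would change
    requested_by: str   # project or person proposing the change
    description: str
    approved_by: Set[str] = field(default_factory=set)


# Hypothetical assignments mirroring the two projects described above.
STEWARDS = {
    "customer": "customer steward",
    "sales channel": "sales channel steward",
}


def approve(request: ChangeRequest, approver: str) -> None:
    """Record an approval, but only from the steward who owns the subject area."""
    owner = STEWARDS.get(request.subject_area)
    if owner is None:
        raise LookupError(f"no steward assigned to {request.subject_area!r}")
    if approver != owner:
        raise PermissionError(f"{approver!r} does not own {request.subject_area!r}")
    request.approved_by.add(approver)


def is_approved(request: ChangeRequest) -> bool:
    """True once the owning steward has approved the request."""
    return STEWARDS.get(request.subject_area) in request.approved_by
```

In this sketch, a request raised by the sales channel project against the customer subject area stays unapproved until the customer steward signs off; an attempt by the sales channel steward to approve it raises `PermissionError`.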
How do you differentiate the roles of data stewards, data administrators and database administrators? Each of these functions must have its own roles and responsibilities spelled out clearly to avoid any confusion. There is little overlap in terms of each group's responsibilities; however, there is a great deal of collaboration and communication that must take place to ensure that the data assets of the corporation are used to provide the highest return on investment. Table 3 lists the specific roles for each function--data stewardship, data administration and database administration.
|Table 3: Roles/Responsibilities of Data Stewards, Data Administrators and Database Administrators|
|Data stewards:|
|Resolving data integration issues;|
|Determining data security specifications;|
|Documenting data definitions, calculations, summarizations, etc.;|
|Maintaining/updating business rules;|
|Analyzing and improving data quality;|
|Ensuring the alignment of the business requirements with the IT support systems.|
|Data administrators:|
|Translating the business rules into data models;|
|Maintaining conceptual, logical and physical data models;|
|Assisting in data integration resolution;|
|Maintaining meta data repository.|
|Database administrators:|
|Generating physical database schema;|
|Performing database tuning based on usage of the data;|
|Creating database backups and archives when data is no longer needed;|
|Planning for database capacity;|
|Implementing data security requirements;|
|Maintaining business aliases for end-user usage.|
The Importance of Data Stewardship
The data stewardship position probably has the highest profile within the corporation of the three functions mentioned. Why? Because data stewards act as the conduit between IT and end users, helping to align the business needs with the IT systems supporting them. They have the difficult, but very rewarding, task of guaranteeing that one of the corporation's most critical assets, its data, is used to its fullest capacity.
For data stewardship to succeed in your corporation, a new incentive paradigm must be developed, one that rewards people on the basis of horizontal integration rather than only vertical or "bottom line" success. As long as a department or division is solely focused on its bottom line, it will see no benefit in changing its business practices to integrate data and business rules with another department or division. The new incentives should be driven by the groups' success in resolving integration issues, developing unified definitions and changing business practices to conform to the new standards.