My last few columns dealt with managing data quality. The scope included data stewardship, data profiling, data improvements and the data model. These are all needed to manage data as an asset. Another important tool in managing data is the application system and database that store information about this asset. Within the technical community, this information is often called "metadata." That name puts a technical spin on the system, which in turn contributes to the difficulty organizations have in gaining executive and business support for this important data management tool.
In this column and the next, I'll explore the metadata needed in each of the data lifecycle steps: planning, acquisition, maintenance, dissemination and disposal. In subsequent columns, I'll explore the application system, database (often called the metadata repository), and roles and responsibilities. It is important to recognize that the metadata application system and database should be developed incrementally. Their content and development sequence should be driven by business priorities.
The Zachman Framework is a useful vehicle for identifying the metadata that's needed for each of the lifecycle steps [1]. The Framework, shown in Figure 1, demonstrates the type of information that's needed for the development of any complex product, and the metadata management system and database can certainly be classified as a complex product. The Framework recognizes that information needed at different points in the development cycle differs and that at each point, information is needed to answer the six interrogatories: what, how, where, who, when and why. While the sequence for answering these questions may be dictated by circumstances, the priority or importance of each question is ultimately equal in the creation of a complete description.
Metadata About Planning
Planning is the step that is performed before an asset is even brought into the enterprise. The metadata needed to answer the six critical questions for the planning step includes:
- What lists the major data subjects and required data elements, along with definitions and quality expectations.
- How describes the business processes that should be used for acquiring the data assets.
- Where provides information on the way in which the data should be organized. This is represented by the subject area model and business data model described in my March column.
- Who defines the roles and responsibilities involved in managing the data asset as a whole as well as those for each of the groups of data elements. This is not restricted to the electronic data. It includes information about the business responsibilities associated with each of the data lifecycle steps.
- When provides information on the required data timeliness from both a business and technical perspective.
- Why explains the reasons for having the data in the organization. This, in turn, provides information on business rules to drive priorities and to enable business decisions on the value of pursuing complex processes to acquire or protect specific sets of data.
To properly manage data as an asset, each of these should be addressed even before the data is acquired. The information about each of these planning dimensions is acquired using the metadata capture mechanisms, stored and maintained in the metadata repository and disseminated using the metadata access mechanisms.
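As a rough illustration of the planning checklist above, the six questions can be held in one record per data subject and checked for completeness before acquisition begins. This is only a sketch: the class name, field names, helper function and the sample "Customer" content are all hypothetical, not part of the Zachman Framework or of any particular metadata repository product.

```python
from dataclasses import dataclass

# The six interrogatives, in the order used in this column.
INTERROGATIVES = ("what", "how", "where", "who", "when", "why")


@dataclass
class PlanningMetadata:
    """Planning metadata for one data subject (illustrative field names)."""
    subject: str
    what: str = ""   # data elements, definitions, quality expectations
    how: str = ""    # business processes for acquiring the data
    where: str = ""  # how the data is organized (subject area / business model)
    who: str = ""    # roles and responsibilities
    when: str = ""   # required timeliness
    why: str = ""    # business reasons for holding the data


def unanswered(record: PlanningMetadata) -> list:
    """Return the interrogatives that still have no content."""
    return [q for q in INTERROGATIVES if not getattr(record, q).strip()]


# Hypothetical example content, made up for illustration only.
customer = PlanningMetadata(
    subject="Customer",
    what="Name, address, credit limit, with definitions and quality targets",
    who="Customer data steward",
    why="Needed to bill customers and assess credit risk",
)

missing = unanswered(customer)
print(missing)  # ['how', 'where', 'when']
```

A non-empty result signals that planning for that subject is incomplete, which fits the point that all six dimensions should be addressed before the data is acquired.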
Metadata About Acquisition
Once we complete the planning for data, we're ready to acquire it. Acquisition metadata also needs to address the six dimensions.
- What includes the data acquisition rules for initially bringing data into the organization. Within the business intelligence (BI) world, this includes the information about the data extraction from the source system environment.
- How provides information on the business rules for bringing information into the organization. Within the BI environment, this includes the mapping documents for data integration and aggregation.
- Where is addressed by information about the business locations at which data is acquired. For BI, this also includes information about the source systems that are used.
- Who identifies the roles and responsibilities within the organization associated with acquiring the data. For BI, this includes information about the people responsible for maintaining the extract, transform and load processes and for resolving any issues that may arise.
- When provides information on the timeliness of data acquisition. For the business, this provides information about the points in the business processes when data is actually acquired. (For example, quantity ordered is acquired at the time an order is processed, but quantity shipped is not generated until an order is fulfilled.) For BI, this is information on the data acquisition schedule and job flow.
- Why provides information on the reasons for acquiring the data so that decisions can be made about the business and technical process used both initially and in the BI environment.
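Each acquisition question above has both a business answer and a BI answer, so one way to picture the acquisition metadata is as a small grid of six questions by two perspectives. The following is a minimal sketch under that assumption. The names `Answer`, `new_acquisition_record` and `gaps` are hypothetical, and the BI sample values are invented for illustration. The business "when" text paraphrases the order example above.

```python
from dataclasses import dataclass

INTERROGATIVES = ("what", "how", "where", "who", "when", "why")
PERSPECTIVES = ("business", "bi")


@dataclass
class Answer:
    """One interrogative answered from two perspectives (assumed structure)."""
    business: str = ""
    bi: str = ""


def new_acquisition_record() -> dict:
    """Create an empty grid: one Answer per interrogative."""
    return {q: Answer() for q in INTERROGATIVES}


def gaps(record: dict) -> list:
    """List (interrogative, perspective) pairs that are still empty."""
    return [
        (q, side)
        for q in INTERROGATIVES
        for side in PERSPECTIVES
        if not getattr(record[q], side).strip()
    ]


record = new_acquisition_record()
record["when"] = Answer(
    business="Quantity ordered is captured when the order is processed",
    bi="Hypothetical nightly extract schedule and job flow",
)
record["where"].business = "Hypothetical regional order desks"

open_items = gaps(record)
print(len(open_items))  # 9 of the 12 cells are still empty
```

Keeping the two perspectives side by side makes it easy to see where the business description exists but the corresponding BI detail, such as source system information, has not yet been captured.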
Metadata management is managing information about the data assets of the enterprise. The development of the metadata management capabilities should, therefore, follow traditional approaches for systems development. The Zachman Framework has proven to be a valuable tool during the development process. In this column, I applied it to describe the metadata needed for data asset planning and acquisition.
[1] J.F. Sowa and J.A. Zachman. "Extending and Formalizing the Framework for Information Systems Architecture." IBM Systems Journal, Vol. 31, No. 3, 1992.