
Meta Data Management Life Cycle Reviewed

  • December 18 2003, 1:00am EST

Meta data does not just appear out of nowhere, nor does it mystically fade away. Meta data is managed around the life of an asset. One of the hard lessons we have learned over the past few years is that the value of meta data slowly degrades over time for various reasons such as quality, staleness and lack of use. We can loosely define an asset as any person, place or thing within the technological community. Examples of assets include databases, logical models, physical models, XML structures, components, documents, metrics, systems and interfaces. Figure 1 provides a high-level view of the meta data management life cycle around an asset.


Figure 1: Meta Data Management Life Cycle

The asset itself can be described as a container of data, information, knowledge and/or wisdom that needs to be surgically extracted. A process acquires the meta data from the asset; this can be an automated extraction or done by hand. Loading the data by hand is often used in conjunction with an extraction utility and, in most cases, is required to fill in the information gaps. A third, fairly obvious option is to integrate a tool or collection of tools into the system development life cycle. This would solve 90 percent of the issues we have and push meta data into the most active role possible. However, in a large enterprise the environment is not as homogeneous as people would lead us to believe. In addition, the odds are that the majority of your technology is not built upon a current set of standards, which makes automatic enterprise integration nearly impossible, if not extremely expensive. While these processes are fairly well known and documented in various publications, the next series of steps is a source of much confusion and strife.
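As a concrete illustration of the automated-plus-manual pattern, here is a minimal sketch in Python that pulls column-level meta data from a database catalog and then enriches it by hand. The table, columns and description are hypothetical, and SQLite's catalog interface stands in for whatever your platform exposes:

```python
import sqlite3

# Hypothetical asset: a small customer table in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

def extract_table_meta(conn, table):
    """Automated extraction: pull column-level meta data from the catalog."""
    cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # PRAGMA table_info rows are (cid, name, type, notnull, default, pk).
    return [{"name": c[1], "type": c[2], "primary_key": bool(c[5])} for c in cols]

meta = extract_table_meta(conn, "customer")

# Manual enrichment fills the gaps the automated pass cannot see.
meta[2]["description"] = "Primary contact address (entered by hand)"
```

The point of the sketch is the division of labor: the catalog gives you names and types for free, while context such as descriptions still has to come from a person.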

The divergence of thought comes from the value generated by the passive and active utility built around the asset. Passive utility can be defined as the publishing, indexing, searching and result generation of meta data information. Many experts argue that passive meta data has limited value, but we have many examples where this type of utility is not only valued but demanded. It is widely recognized that an organization's most valuable knowledge, its essential intellectual capital, is not limited to information contained in official document repositories and databases: scientific formulae, "hard" research data, computer code, codified procedures, financial figures, customer records and the like (Bobrow & Whalen, 2003). However, in order to develop the know-how, ideas and insights of the community at large, meta data must be managed at every stage of the asset's life. Since passive utility is the discovery and knowledge-based reuse of meta data information, it stands to reason that passive utility must be delivered first. Active utility without information is simply pointless. When you review Figure 1, the importance of getting the meta data right should become apparent.
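To make the passive side concrete, the indexing-and-searching utility described above can be sketched as a small inverted index over repository descriptions. The asset names and descriptions here are invented for illustration:

```python
from collections import defaultdict

# Hypothetical repository entries: asset name -> free-text description.
repository = {
    "CUSTOMER_DB": "customer records and contact history",
    "SALES_MART": "monthly sales metrics by region",
    "ORDER_XSD": "XML schema describing order messages",
}

def build_index(repo):
    """Index every word of every description back to its asset names."""
    index = defaultdict(set)
    for asset, text in repo.items():
        for word in text.lower().split():
            index[word].add(asset)
    return index

def search(index, term):
    """Result generation: which assets mention this term?"""
    return sorted(index.get(term.lower(), set()))

index = build_index(repository)
```

Publishing, indexing and searching are exactly this loop at enterprise scale; the value shows up the first time someone finds an asset they did not know existed.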

Getting it right means obtaining accurate, complete and contextual information from an asset and then providing access to that information across the organization.

Is active utility a bad thing? On the contrary, active utility is like shooting your age in golf or winning the tournament. The hours and hours of hitting practice balls are the passive data collection activity. The payoff and the glory come from the active utility. Our only point is that you can't have the latter without the former. Active utility is simply taking the meta data information and creating a new value proposition for the business or technical community. Some examples of active utility include:

  • Impact analysis across the asset population
  • Cross-reference and implied/derived meaning
  • Dynamic data exchange (XML)
  • Real-time metrics
  • Web services and the utilization of meta data-driven architectures
  • Dynamic reuse of asset information (i.e., screen/report field lookup)
  • XML file validation using DTD and schemas

In fact, active utilization may create a new asset in the form of new functionality. For example, providing the ability to cross search an asset collection is analogous to bundling products that deliver new utility to the customer. Hence, the return arrow back to the asset inventory for the active utility process. (See Figure 1.) Therefore, not only can a meta data services group catalog technical assets, they can also create them.
The final area of Figure 1 is the information decay arrow. Information that stays within the repository will decay, meaning the data is only 100 percent accurate for a period of time. Why? The most obvious reason is that the technological community is constantly changing; even the low-level data constructs are changing. Suppose we take a snapshot of the logical, physical and operating system view of a database. How long will this snapshot be accurate? Perhaps a better question is: how long before the next DBA modifies the data structure or the modeler updates the text on a field? What we do know is that the longer information sits in a repository, the greater the chance that it is not only inaccurate but could lead to erroneous decisions from the end-user perspective. A content aging strategy should be part of every meta data implementation. Content aging simply shows the administrator which information hasn't been updated in the past 30 days, or whatever time period is appropriate to the business. Contacts can then be made to determine whether the information should be removed or updated.
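A content aging check of the kind described above can be sketched in a few lines; the asset names, timestamps and 30-day threshold are all illustrative:

```python
from datetime import datetime, timedelta

# Hypothetical repository entries with their last-updated timestamps.
entries = {
    "CUSTOMER_DB": datetime(2003, 12, 1),
    "SALES_MART": datetime(2003, 10, 15),
    "ORDER_XSD": datetime(2003, 6, 2),
}

def stale(entries, as_of, max_age_days=30):
    """Return assets whose meta data is older than the aging threshold."""
    cutoff = as_of - timedelta(days=max_age_days)
    return sorted(asset for asset, updated in entries.items() if updated < cutoff)

# Everything not touched in the last 30 days goes on the follow-up list.
stale(entries, as_of=datetime(2003, 12, 18))
```

The output of a report like this is not a deletion list but a contact list: the owners of the stale entries decide whether to refresh or retire them.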

What a great research question: what is the rate of decay for information? Think about the information collected on you: address, credit score, medical history, etc. You, as a human being, are constantly changing and, therefore, the information about you is constantly changing. Your tastes, goals and plans change as you move into different stages of life. So we know information decays, but at what rate? Perhaps there is a data half-life out there waiting to be discovered. While we are not sure what the rate of information decay is, we can slow it down by increasing the usage of meta data information in both passive and active frameworks. In real estate, the three most important words are location, location, location. In meta data, it's quality, quality, quality. In that order!
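If such a half-life existed, the arithmetic would be ordinary exponential decay. A sketch, with an entirely hypothetical 180-day half-life:

```python
def accuracy(initial, days, half_life_days):
    """Exponential decay: the remaining accuracy halves every half-life."""
    return initial * 0.5 ** (days / half_life_days)

# If a repository entry started 100% accurate and its (made-up) half-life
# were 180 days, roughly a quarter of its accuracy would remain after a year.
accuracy(1.0, 365, 180)
```

The half-life itself is the unknown research quantity; the point of the formula is only that staleness compounds, which is why usage and review slow the decay.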

It takes a great deal of both individual and collective energy to have a successful year in organizations like yours and mine, especially in the world of data architecture. As the year comes to a close, I want to thank all of you for your support for the concepts of meta data, data resource management, architecture and the like. We’re looking forward to another year of excitement, change and opportunities. It is the body of knowledge that we are expanding and perhaps there is no greater calling within the technical community. Thank you for your help and best wishes for a happy, prosperous and healthy new year.
