A college professor friend of mine who overheard me speaking about the concept of meta data jokingly told me that in his line of work, whenever thinkers or philosophers don't understand something or can't explain it very well, they stick the prefix "meta" in front of the word and call it done (e.g., metatheory, metaphysics and meta-ethics).

I assured him that was not the case with meta data ... that we always know what we're talking about in our field. Lately, however, his comment has been troubling me. Consider, for example, my column last month which presented a framework for deliverables that concern a data architect. If someone were to ask me where meta data fits into that framework, I'd have to reply honestly: It doesn't. I want to address that issue this month.

The Growth of Meta Data and the Great Meta Data Void

Meta data is information that allows you to catalog, sort and find digital content in different ways. Keywords, bibliographic information and file types are examples of meta data. If captured and cataloged properly, meta data provides an architecture that can be used to store, sort and search for digital content. Our understanding of the importance of meta data grew through the 1970s and early 1980s. Meta data came to be understood as the sum of all information about a system, including databases, application programs and business requirements. It linked conceptual designs to the physical objects. It covered all the objects of the data architecture framework, as well as the objects of the application architecture and the broader systems architecture. The combined pieces make up the metamodel for meta data. Perhaps the biggest impact, at least to the bottom line, was that project leaders started planning for 15 percent additional work-hours to create the meta data in data dictionaries.

Fifteen percent overhead is pretty steep. To try to lower that overhead, data dictionaries started integrating with computer-aided software engineering (CASE) tools. This integration permitted meta data to be generated automatically from routine systems work, thereby reducing the overhead.

Then client/server arrived, bringing with it a distributed architecture. New CASE tools were needed for these environments. Because a host of new technical issues such as GUI generation and network management were introduced by client/server, attention focused on these new issues instead of meta data. With many new technologies and a sense that pilot applications of the new technologies would not live very long, the 15 percent overhead for meta data was indefensible.

The current sorry state of meta data tool support has left the industry staring at a great void. Stamford-based Gartner Inc. reports, "Less than 10 percent of Global 5000 companies have successfully implemented an enterprise-wide meta data repository with DW meta data management as the first step." [1]

What's a Company to Do?

What should today's enterprises do about meta data? The first step should be to identify the meta data that will bring more value to the enterprise than the cost of the labor needed to create it. If you're in a fully integrated CASE environment, the full metamodel defined in the 1970s and 1980s is not only feasible, but highly desirable. However, today's diverse, fragmented environments cannot justify the effort required to create and maintain the full metamodel. Proposed system changes would certainly be easier and quicker to evaluate with comprehensive meta data, but the effort to position an organization for quick impact analysis is currently greater than the subsequent savings for most organizations. Regrettably, this is a case of deciding between building something "just in case" it is needed and a "just-in-time" approach. The clear winner across our industry is the just-in-time type of impact analysis, rather than the just-in-case creation of comprehensive meta data.

Business and Technical Meta Data

There is one category of meta data, however, that does justify its creation: the business meta data that describes information available through a data warehouse. Business meta data begins with informative definitions of the data available to users, including business descriptions of the sources and of calculations or transformations that may be applied in the process of moving the data from the sources. It includes search capabilities that allow users to request a list of all data items with similar names, which helps users select the correct data item for their queries. It includes context information to allow users to understand the context within which each data item was created. As an overly simple example, there is little value in a data item called "Date of Creation" if we don't know what was being created. Business meta data also includes data on the timeliness of data (that is, exactly when the latest update occurred).
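The elements of business meta data listed above can be pictured as a small catalog record plus a name search. What follows is only a minimal sketch in Python under stated assumptions: the `BusinessMetaData` class, its field names, the `find_similar` helper and the sample entries are all invented for illustration and do not describe any particular repository product. The name matching uses the standard library's `difflib.get_close_matches`, which is one simple choice among many.

```python
from dataclasses import dataclass
from datetime import datetime
from difflib import get_close_matches


@dataclass
class BusinessMetaData:
    """Hypothetical catalog entry for one data warehouse item."""
    name: str               # data item name as users see it
    definition: str         # informative business definition
    source: str             # business description of the source
    transformation: str     # calculation applied while moving the data
    context: str            # what the item is about
    last_updated: datetime  # timeliness: when the latest update occurred


def find_similar(catalog, query, cutoff=0.5):
    """Return entries whose names resemble the query, closest first."""
    by_name = {entry.name.lower(): entry for entry in catalog}
    names = get_close_matches(query.lower(), list(by_name), n=5, cutoff=cutoff)
    return [by_name[m] for m in names]


# Illustrative entries only; every value here is invented for this sketch.
catalog = [
    BusinessMetaData("Date of Creation (Invoice)",
                     "Day the invoice was issued", "Billing system",
                     "None", "Invoice", datetime(2002, 1, 15)),
    BusinessMetaData("Date of Creation (Customer Account)",
                     "Day the account was opened", "Customer master file",
                     "None", "Customer account", datetime(2002, 1, 14)),
    BusinessMetaData("Net Revenue",
                     "Revenue after returns and discounts", "General ledger",
                     "Gross revenue minus returns and discounts",
                     "Monthly sales", datetime(2002, 1, 10)),
]

# A user looking for "date of creation" sees both candidates with context.
for entry in find_similar(catalog, "date of creation"):
    print(entry.name, "|", entry.context, "|", entry.last_updated.date())
```

The point of the sketch is the "Date of Creation" problem: the search returns both similarly named items, and the context and timeliness fields let the user tell them apart before writing a query.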

Why is business meta data so vital? Without useful descriptions of the data in a data warehouse, business users are hamstrung. They are forced to rely entirely on data warehouse support staff, or on a small number of power users, to navigate the data for them. This situation is not a choice between just-in-case and just-in-time. In fact, the situation actually undermines the entire value proposition of data warehousing and hinders the opportunity to leverage enterprise information for the benefit of the enterprise. It also encourages individual interpretation of the meaning of enterprise data which leads to incorrect findings and questionable decisions.

Technical meta data is the other broad category of meta data. It may be created by select, independent design or implementation tools such as ETL tools, query tools, data design tools or DBMS catalogs. When created by such tools, the meta data is limited to specific meta data needed by the tool. Often, the meta data is not accessible for other purposes. The tool helps the development, maintenance and enhancement process by simplifying the developer interface, but it does little to support enterprise meta data or the integration of the tool's meta data with applications built with other tools.

Coping with Dis-Integration

The importance of meta data will grow as life becomes increasingly complex. Given that complexity, we cannot expect a return to the integrated meta data repositories of the mid-1980s. Rather, we can expect increasing dis-integration. In such a world, how is an organization to respond? I believe the answer is to take a different look at meta data and to concentrate efforts on that component of meta data that brings real value to the enterprise.

Consider meta data as four concentric circles (see Figure 1). At the center is meta data about master data. Moving out, the next ring is meta data about data warehouse data. The third ring incorporates the processes and transformations involved in moving data to and through a data warehouse. The outermost ring is data about all the other applications in the enterprise. You may have observed that the meta data that is most valuable to the enterprise is that which is shared most broadly. Master data cuts across many applications. Additional data stored in a data warehouse is typically stored there because there is a business case for exposing that data to an important user population. Application data that is held in applications other than the data warehouse is the most difficult to justify as meta data.

Taking a Tactical Approach


Figure 1: Enterprise Meta Data

Figure 1 can help organizations think more clearly about their meta data strategy and take a tactical approach for creating and maintaining meta data. The inside circle of "master data meta data" represents the locus of greatest value for most enterprises. As one moves out from the center (to data warehouse meta data, process/EAI/ETL meta data and finally to application meta data), an organization needs to justify the cost of creating the meta data against the value that the meta data will create. Most organizations today find that they cannot move very far from the center. If that's the case, then there is still value in concentrating efforts on that core component. As environments change or business value changes, organizations can re-evaluate the degree of meta data that is right for them, balancing costs against benefits.
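The ring-by-ring justification can be expressed as a simple walk outward from the center. The Python sketch below is an illustration under stated assumptions, not a prescribed method: the `Ring` class, the `rings_to_build` function and every cost and value figure are hypothetical, and the rule of stopping at the first ring whose value does not exceed its cost is one assumed reading of "cannot move very far from the center."

```python
from dataclasses import dataclass


@dataclass
class Ring:
    """One circle of Figure 1, with hypothetical cost and value estimates."""
    name: str
    cost: float   # estimated labor cost to create and maintain the meta data
    value: float  # estimated value the meta data brings to the enterprise


# Ordered from the center outward; all figures are invented for illustration.
RINGS = [
    Ring("master data", cost=100, value=400),
    Ring("data warehouse", cost=150, value=250),
    Ring("process/EAI/ETL", cost=200, value=120),
    Ring("application", cost=500, value=90),
]


def rings_to_build(rings):
    """Walk outward and stop at the first ring that cannot be justified.

    Assumption: a ring is justified only when its value exceeds its cost,
    and rings beyond an unjustified ring are not considered.
    """
    selected = []
    for ring in rings:
        if ring.value <= ring.cost:
            break
        selected.append(ring.name)
    return selected


print(rings_to_build(RINGS))
```

Re-running the same walk with revised estimates is the re-evaluation step: when environments or business value change, only the numbers change, not the approach.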

Next month: More about the strategy for phasing meta data creation.

Reference

1. Gartner Inc., "Alternatives to Data Warehouse Meta Data Integration," by Blechar, M. and Strange, K.; November 1, 2001.