Positioning Meta Data

  • July 01 2002, 1:00am EDT

What is the position of meta data relative to the scope of data architecture? Readers have challenged the absence of meta data from the definition of data architecture that I have developed and extended over the last several months. Others have suggested that meta data should be included as part of data management.

I defined data architecture as an orderly arrangement of parts to organize, store, access and move data. Data management describes the processes needed to manage the parts of the data architecture. Graphically, I think of data architecture as resting on a backplane of data management. I am indebted to my colleague Mike Green for creation of the data management framework shown in Figure 1.

However, meta data is more comprehensive than data architecture and data management. Meta data describes the components of applications, of the technology and of the business itself. We can extrapolate from the data management framework in Figure 1 and conceive of similar frameworks for applications and technology, with the application architecture sitting on a backplane of application management and the technical architecture sitting on a backplane of technology management. The full breadth of meta data spans these three areas. It is part of all three management functions as shown in Figure 2.

Figure 1: Data Management Framework

Figure 2: Position of Meta Data

Meta data is part of management functions rather than just being part of the architecture because the management of meta data and the management of processes that use meta data are absolutely vital to the successful use of meta data. Without successful use, there is no point to collecting meta data beyond the very provincial value of meta data to individual tools that are powered by the meta data they collect within themselves.

Any meta data approach that relies on the manual capture of meta data is doomed to failure. At the height of popularity of data dictionaries in the late '70s and early '80s, Accenture developed considerable experience with processes for standardizing, gathering and enforcing meta data. We found that application development teams had to be expanded 15 percent to provide the manual support needed for the effort. Ensuring that meta data was maintained during the production cycle of an application required that the organization responsible for meta data had total veto power over production moves for application changes to ensure that meta data documentation was maintained. Few companies have an appetite for such measures. However, it is reasonable to capture meta data when it can be captured automatically or when the creation of meta data automatically creates the objects being described by the meta data.

Today's tools are attempting to address this problem. When a database is created, the DBMS automatically creates meta data (called the catalog); however, that meta data is available only to DBAs. Data modeling tools create meta data automatically as models are defined, but again the meta data is generally available only to data modelers. ETL tools create meta data as part of the process of defining transformation rules and process flows. Query tools create meta data as part of the act of configuring the tool for use by users against specific data. Ditto for CASE tools and business process tools in the application management space. The same is true for configuration management, schedulers and performance management tools in the technology management space.
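To make the catalog point concrete, here is a minimal sketch that reads the meta data a DBMS records on its own when a table is created. SQLite stands in for the DBMS purely as an assumption for illustration, and the `customer` table and the `describe_tables` helper are hypothetical names.

```python
import sqlite3


def describe_tables(conn):
    """Return {table_name: [(column_name, declared_type), ...]} read from the catalog."""
    catalog = {}
    # sqlite_master is SQLite's built-in catalog table; it is populated by CREATE statements.
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table_name,) in tables:
        # PRAGMA table_info rows have the shape (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table_name}")').fetchall()
        catalog[table_name] = [(col[1], col[2]) for col in columns]
    return catalog


conn = sqlite3.connect(":memory:")
# Hypothetical table; nobody documents it by hand.
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT)"
)
catalog = describe_tables(conn)
print(catalog)
```

No one typed the column descriptions into a dictionary; they exist because the table was created. The catch described above still applies: in practice this information tends to stay with whoever has access to the database.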

None of these solutions addresses meta data for applications built in programming languages such as COBOL, C or Java. Most of them do not permit manual creation of meta data, even if companies were willing to invest in having someone create it in the tool.

The fragmentation of meta data across tools creates a serious redundancy problem. Each tool is likely to be used by personnel in a different part of the organization, and each user is likely to define meta data independently of the others, creating today's morass of redundant, conflicting definitions and fragmented meta data.

We are finally beginning to see a glimmer of light at the end of the tunnel. XML is providing a standard vehicle for exchanging data between tools, and the Common Warehouse Metamodel (CWM) is providing the standard definition, at least within a data warehouse context. However, these architectural solutions do not address the management issue of how to ensure that meta data is complete and consistent. If several organizations across an enterprise are all allowed to define meta data within their own tools, then XML and the CWM are helpless to resolve the syntactic discrepancies. Management processes must be adopted that agree on a single point of entry for meta data, which can then be shared across all tools through the use of the architecture solutions being provided by vendors. This is a process we can initiate today, in anticipation of the possibilities appearing on the horizon.
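The single point of entry can be sketched in a few lines. This is an illustration under stated assumptions, not a CWM implementation: the `MetaDataRegistry` class, its methods and the XML element names are all hypothetical, and the XML produced is an ad hoc format rather than the CWM interchange format.

```python
import xml.etree.ElementTree as ET


class MetaDataRegistry:
    """Hypothetical single point of entry for business definitions."""

    def __init__(self):
        self._definitions = {}

    def register(self, name, definition, owner):
        """Accept a definition once; reject a conflicting one from another group."""
        key = name.strip().lower()
        existing = self._definitions.get(key)
        if existing is not None:
            if existing["definition"] != definition:
                raise ValueError(
                    f"conflicting definition for {name!r}; "
                    f"already defined by {existing['owner']}"
                )
            return  # identical definition already on file, nothing to do
        self._definitions[key] = {
            "name": name.strip(),
            "definition": definition,
            "owner": owner,
        }

    def to_xml(self):
        """Serialize the agreed definitions so other tools can import them."""
        root = ET.Element("metaData")
        for key in sorted(self._definitions):
            entry = self._definitions[key]
            element = ET.SubElement(
                root, "element", name=entry["name"], owner=entry["owner"]
            )
            element.text = entry["definition"]
        return ET.tostring(root, encoding="unicode")


registry = MetaDataRegistry()
registry.register(
    "Customer", "A party that has purchased at least one product", owner="Sales"
)
try:
    # A second group tries to define the same term differently.
    registry.register("customer", "Any party with an open account", owner="Finance")
    conflict = None
except ValueError as error:
    conflict = str(error)

xml_text = registry.to_xml()
print(conflict)
print(xml_text)
```

The exchange format only carries what the registry has already reconciled. The conflict is caught at the point of entry, which is a management decision about who may define what, not something the XML itself can settle.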

It is the critical nature of coordinated meta data management that moves meta data beyond data architecture and causes it to span data, applications and technology.
