Question: Should the metadata layer that sits between the data marts and the query/reporting tools be owned by the business or by IT, and what are the pros and cons of each approach?

 

Sid Adelman’s Answer: It depends on what is meant by “owned.” IT should provide the metadata repository, along with the rules about who can add, update and delete entries in it, and IT is responsible for the technical metadata. The business is responsible for creating and maintaining the business metadata, including business definitions, business rules, security and valid values. This business metadata is housed in the metadata repository that IT implements and supports.

 

Tom Haughey’s Answer: Particularly from a data warehouse (DW) point of view, there are several key points about metadata:

 

  • It is vital to the DW;
  • It must be created progressively as the data model is being created;
  • It must include all relevant components; and
  • It must be published in a practical medium.

 

I feel it is important to understand each of these points in order to see where metadata fits in the larger picture.

 

As you know, a data model has three main components: the model diagram with its entities, attributes and relationships; metadata, or definitions of everything in the model; and, finally, supplementary business rules that cannot be expressed in the structure of the data model.

 

Metadata for the DW should include definitions of the objects with some sample values, identification of the sources (where the data came from), a description of the business and technical rules used to transform it, the timing of the transformation (e.g., data on the last day of the month is not the same as data on the last Friday of the month), an appraisal of its quality, standard queries, etc. Metadata is crucial to a successful data model and is indispensable for a DW.

Remember, there are both logical and physical data models, and there will be metadata for each of them. The physical metadata may be slightly different from the logical metadata, because the physical data model may differ from the logical one, mostly due to design differences, trade-offs and additions. In all of this, it is my belief that the business is the best agent to provide the metadata. Even so, it often falls on the data model developer to provide the first cut. A convenient way is for the data modeler to gather as many relevant sources as possible and use them to create the first version of the metadata. This version is then passed to the relevant business and systems people for them to validate, correct and complete.
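To make the list of components concrete, here is a minimal Python sketch of what one metadata entry might hold and how a first cut could be checked for gaps before it goes to the business for review. The class, field names and sample entry are hypothetical illustrations, not the layout of any particular repository product.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class MetadataEntry:
    """Metadata for one DW object. All field names are hypothetical."""
    name: str
    definition: str = ""
    sample_values: List[str] = field(default_factory=list)
    sources: List[str] = field(default_factory=list)   # where the data came from
    transformation_rule: str = ""                       # business/technical rule
    transformation_timing: str = ""                     # e.g. "last day of month"
    quality_appraisal: str = ""
    standard_queries: List[str] = field(default_factory=list)
    validated_by_business: bool = False                 # set after business review


def missing_components(entry: MetadataEntry) -> List[str]:
    """Return the names of text/list components that are still empty."""
    missing = []
    for f in fields(entry):
        value = getattr(entry, f.name)
        if isinstance(value, bool):
            continue  # the review flag is tracked separately
        if not value:
            missing.append(f.name)
    return missing


# A data modeler's first cut, assembled from whatever sources were at hand.
first_cut = MetadataEntry(
    name="month_end_balance",
    definition="Account balance at close of business on the last calendar day of the month.",
    sources=["ledger system (hypothetical)"],
    transformation_timing="last day of month",
)

# Components the business and systems reviewers still need to complete.
gaps = missing_components(first_cut)
```

A check like this only shows which components are empty; whether a definition is actually right remains a judgment for the business reviewers.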

 

The term “owns it” is ambiguous. As I just said, the business should provide the final version of the metadata. The data architect maintains it in the relevant place, such as with the logical and physical models. The extract, transform and load (ETL) architect maintains transformation metadata. The BI architect maintains BI-specific metadata. The BI teams, collaborating with the other architects, should be responsible for delivering the metadata so that people can meaningfully use it while using BI tools. This is critical for the success of the metadata and even of the BI solution. Do not assume that making the metadata available through some repository is necessarily a good way to deliver it. Delivery should be seamless, in that the information consumer should be able to use the metadata directly from the BI application without having to switch (or even toggle) to a metadata application.

 

The business should be the primary creator of the metadata. A practical way to achieve this is for the data management developer to create a first cut of the metadata, pass it to business people for review and correction, and then store it back with the data model.

 

Chuck Kelley’s Answer: Both. The semantic layer that the business sees is owned by the business, although IT will most likely implement what the business decides. IT owns the technical metadata. Let us look at an example.

 

Revenue per available room (REVPAR) is used in the hospitality industry. The business owns the name REVPAR and the calculation: revenue (in your preferred currency) / number of available rooms. IT owns the implementation of the names in the semantic layer (the metadata layer, as you called it) and makes sure that the calculation is correct. But what does “per available room” mean? Is it the number of rooms on the property, or the number of rooms that could actually have been sold (maintenance could be happening on some rooms)? So, if IT’s role is to make sure that the calculation is being done correctly, then we need to make sure that we understand every component of the equation, without assumptions.
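The following short Python sketch shows why the definition of “available rooms” matters. The function name and the figures are made up for illustration; the point is only that the same revenue yields two different REVPAR values depending on which room count the business means.

```python
def revpar(room_revenue: float, available_rooms: int) -> float:
    """Revenue per available room: revenue divided by available rooms."""
    if available_rooms <= 0:
        raise ValueError("available_rooms must be positive")
    return room_revenue / available_rooms


# Hypothetical figures for one night at one property.
revenue = 12000.0
rooms_on_property = 100
rooms_under_maintenance = 20

# Interpretation 1: every room on the property counts as available.
revpar_all_rooms = revpar(revenue, rooms_on_property)

# Interpretation 2: only rooms that could actually have been sold count.
revpar_sellable_rooms = revpar(revenue, rooms_on_property - rooms_under_maintenance)
```

With these figures the first interpretation gives 120.0 and the second gives 150.0, so the business definition has to be settled before IT can say the calculation is correct.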
