For example, Figure 1 is a dimensional PDM, also known as a star schema. But is there another data model, the implementation data model, that comes after this to show this design will not be implemented in flattened tables but instead through database views on top of a data warehouse's more normalized set of tables? For this challenge, I asked the Design Challengers, "Does your organization recognize both a PDM and an IDM or just the PDM, and why?"
I believe there actually is both a PDM and IDM. To answer this challenge, we need to ask two basic yet essential questions about the PDM: Who are we building the model for? What is the purpose of the model? Traditionally, the PDM is built for highly technical roles, such as developers, support personnel, database administrators and report writers. At times, the audience can also include data architects and modelers. The purpose at a high level is the same purpose as for any data model: communication.
So which is a better communication tool, a star schema such as the model shown, or a set of views shown over a normalized set of tables? You may be thinking it depends on which of the technical roles needs to read the model, and you would be right.
A database administrator, for example, would most likely prefer to see the actual implementation model - the more complex one showing views over tables. A data warehouse architect, concerned with preserving the underlying integrity of a design, would also need to see the more complex design. Any role concerned with improving performance or supporting an application would also need to be aware of the implementation data model.
However, do report writers need to see the additional complexity of views over tables, or can they build queries based upon the simpler star schema structure with just boxes and lines? Most likely, assuming the table/view naming is consistent, they can use the simpler structure and do not need to know how it was actually implemented. In fact, we can go further and say in the case of a star schema, even a less technical resource such as a business user can view and understand the boxes and lines and definitely not require knowledge of how it was actually implemented.
So to summarize my thoughts, if the audience for the data model does not need to see how the structures were implemented, but instead just needs to know what those structures look like, the physical data model is sufficient. Those customers that need to know how the structures were implemented can refer to the more detailed implementation data model. So, in many projects, especially data warehouse projects, there is a need for both types of models.
The Design Challengers had varying opinions.
Allan B. Kolber, senior enterprise architect, says the physical/implementation distinction is depicted in the Zachman Framework where row 4 is technology platform-type specific (e.g., hierarchical, network, relational) and row 5 is technology specific (e.g., DB2, Oracle, Teradata).
Hillary Helm, data analyst, says, "At my current client, there is a difference between these two models. The PDM does not include indexes or views. This is largely because we have a large amount of views that are essentially copies of the PDM tables. Their purpose is to union the log table structure with the tables containing the active records. Keeping all of this in the PDM would be redundant."
Wayne Kurtz, enterprise architect, responds, "The roles assigned the responsibility to 1) model data and 2) implement a database, or portion thereof, are often held by people whose expertise and experience are quite different. This is especially true in large organizations."
Mark Walker, information architect, writes, "We don't create any kind of implementation model. I don't see the value in defining an implementation model when you can extract the implementation details from the DBMS. I would think that you would spend more time trying to keep the implementation model in sync than it is worth."
Jeff Lawyer, lead data modeler, states, "In this particular case, I would use the data modeling tool in order to model the business views needing implementation, resulting in the above PDM, which I would use for review, validation and consensus. Then, I would revert to the data model representing the database with the more normalized set of tables and create the views in that data model that are needed to implement the star schema view."
Until the next challenge.
Steve Hoberman is currently a data modeling consultant and instructor. He taught his first data modeling class in 1992 and has educated more than 10,000 people about data modeling and business intelligence techniques since then. Steve balances the formality and precision of data modeling with the realities of building software systems with severe time, budget, and people constraints. In his consulting and teaching, he focuses on templates, tools, and guidelines to reap the benefits of data modeling with minimal investment. Steve is the author of five books on data modeling, the founder of the Design Challenges group, and inventor of the Data Model Scorecard.