Continue in 2 seconds

Modeling is Like a Three-Legged Race

  • February 01 2004, 1:00am EST
More in

There are many different types of data models that can be utilized in the data warehousing arena. I would like to clarify the types of models that are useful for designing and deploying dimensional structures. This column is not intended to be a discussion about or in any way to compare and contrast the uses of entity-relationship modeling and dimensional modeling. Perhaps I'll discuss that at another time.

This column includes references to detailed dimensional design topics. It is intended to provide a context for the entire modeling effort. Some of you may not be familiar with all of the design topics mentioned this month. I plan to address these topics in the future.

Many data modelers and database administrators have strong personal preferences regarding what to call the different levels of models that are used to design and develop databases. Rather than debate the nuances of the different labels, I would like to propose that there are different levels of detail that are needed. These levels are:

  1. Business perspective to support communication
  2. Logical table structures
  3. Physical table structures

Let's look at what I include in each of these layers. First, you need to have a mechanism to communicate clearly with the business community and to simplify complexity. This is the purpose of the business dimensional model (BDM). This model provides a common ground that allows the business to truly understand how the data will be organized. If the business community does not understand the model, then the model is not right and you need to continue to work on it. There may be isolated individuals who do not understand what this model represents or how it would be used, but most business professionals are savvy enough to understand a BDM. This provides the perspective that closely represents how users will interact with the data via a business intelligence tool. The BDM also provides enough specificity to allow the team to design table structures.
For traditional systems development, there is often a lot of effort required to evaluate a logical data model and translate it into table structures. Most of this effort is focused on designing for performance. When working on dimensional models, this effort is minimized because the entire modeling approach is based upon ensuring fast query performance while maintaining user understanding.

The second level of design is for logical table structures. This is where the BDM is translated into actual dimension and fact tables. There are several choices that need to be made to represent your dimension tables. The factors that must influence your decisions include the needs of your database management system (DBMS) and business intelligence (BI) or data access tool requirements. Yes, I recommend that you tailor your table design to support the needs of your specific technologies! You still have your BDM to represent a technology independent view. Remember that the only perception the business community will have is based upon the BI tool. Because you invested in that technology, be certain to enable it to work well! The primary factor that drives table design recommendations from the technologies is how aggregate navigation is supported. If you do not have a BI tool selected or your tool does not support aggregate navigation, then you should stick with a basic star schema approach. Next, each fact table must be defined with the logical primary key being the composite of all the appropriate dimension keys. Other characteristics of logical table design are defining the data type, length and domain for each column. Finally, you need to develop a recommended set of aggregate fact tables to implement. These are just initial recommendations, which will be refined based upon actual use over time.

The third level of requirements is the detailed physical table design. These activities do not differ significantly from other physical design; however, the database administrator (DBA) must have an understanding of how dimensional structures will be used. This will ensure that any design recommendations do not violate dimensional modeling principles. This type of table design could be shocking to a DBA who does not have any exposure to dimensional modeling. This could result in a big delay in your project while you educate the DBA. The primary activities here are developing an indexing strategy to determine the physical clustering of the data on disk, including partitioning. A summary of the components follows.

Business Data Model
Business Dimensional Model

Business Dimensions
Fact Groups

Logical Table Design
Dimension Table Design

Support for aggregates
Column data types and lengths

Fact Table Design

Column data types and lengths
Determine to include/exclude derived or calculated facts

Aggregate Design

Determine initial set of aggregates to create

Physical Table Design
Dimension Table Design

Assess need for physical partitioning
Determine physical order of columns
Determine physical order of rows ­ clustering
Develop initial indexing strategy

Fact Table Design

Assess need for physical partitioning
Determine physical order of columns
Determine physical order of rows ­ clustering
Develop initial indexing strategy

Your organization can decide how these activities are grouped, what they are called and where to draw lines between team member responsibilities. It is critical to ensure that each of these is addressed in a timely manner. To ensure successful completion of these tasks, it is important for team members work closely together. For many teams, a single person is responsible for all of these activities. In larger organizations, different groups may have responsibility for only one part of these tasks. In these cases, the different groups must work together. This is more like a three-legged race (where everyone's ankles are tied together), rather than a relay (where each person performs their part independently and passes the resulting work off to the next person).

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access