Continue in 2 seconds

What is the purpose and role of dimensional modeling during data warehouse development process?

  • Anne Marie Smith, Joe Oates, Chuck Kelley, Clay Rehm
  • October 02 2006, 1:00am EDT
More in

Q: What is the purpose and role of dimensional modeling during data warehouse development process and what are the strengths and weaknesses of dimensional modeling?

Clay Rehm's Answer:

The main purpose is to identify common business terms used by your users that are used as query identifiers when they run queries against the data.

The weakness is that you may think you have captured all of the requirements when in reality you may have not. However, being aware of this fact may prevent this from happening. The key to dimensional modeling is to have all of the business users involved - not just the ones managing the effort and not just the ones who will be using it. This is a perfect time to get all of the possible stakeholders together and find out the questions they intend to ask of your data warehouse.

Anne Marie Smith's Answer:

Dimensional modeling is the type of modeling used in data warehouse/data mart design since it is intended to provide the capabilities of end-user querying that is the purpose of a data warehouse. A dimensional model is composed of one table with a multipart key, called the fact table, and a set of smaller tables called dimension tables. Each dimension table has a single-part primary key that corresponds to one of the components of the multipart key in the fact table. This construct is frequently called a "star schema" since the graphical representation can resemble a star. One strength of the star schema (dimensional model) is that it can accommodate various types of user queries so the design for the data warehouse database can be performed without concern for the exact types of queries that users may employ. Another advantage of the star schema is that the model can be extended simply by adding additional rows to both the fact and dimension tables, no need to add new tables or rejoin existing tables. Weaknesses of the dimensional model are that they need to be built after having completed traditional ER (entity-relationship) models so that you have a proper view of the data and its usage in the source systems, and many organizations do not take this essential first step - they go right to dimensional modeling and lose the insight that having an ER model would give. Also, learning how to do a good dimensional model takes time, and some techniques of DM are different from ER modeling, so the modeler has to remember that they are working on a dimensional model so they don't fall back into a more traditional ER design.

Joe Oates' Answer:

I would say that the main purpose of dimensional modeling is to make queries easier to write and more efficient than an equivalent third normal form (3NF) design. It should be pointed out that 3NF designs are suited for transaction processing where the goal is the efficiency of insert as opposed to efficiency of reporting.

Multidimensional models consist of "snowflake schemas" as well as star schemas. Typically, data marts with a single system has the source can use a star schema. But when there are multiple source systems with, for example, multiple product structures and multiple people or organizations that can be involved in a transaction, a simple star schema is not sufficient. Ralph Kimball categorizes these real-world situations as the heterogeneous data problem and data requiring "helper" tables.

The key strengths that I have found for multidimensional modeling and implementation include:

  1. It is much easier to write a query using a dimensional design with six or eight tables (or views masking an underlying snowflake schema) that includes two or three hierarchies than writing a query for the equivalent 3NF design that may have 20 or more tables
  2. It is much easier to talk to business people using a multidimensional design than a more complicated 3NF design
  3. Many query tools expect a multidimensional design
  4. In my experience, it's much easier to add to a multidimensional design and it is to a 3NF design.

And Ralph Kimball's The Data Warehouse Lifecycle Tool Kit, he describes the strengths of the dimensional design on page 147.

Chuck Kelley's Answer:

This depends on your technical architecture. If your data warehouse is based on a grouping of stars as data marts (Ralph Kimball style), then dimensional modeling plays a huge role in the development of the data warehouse. If your data warehouse is based on a data warehouse where access is primarily from the data marts (Bill Inmon style), then dimensional modeling plays a lesser role, but still quite important because it is where the largest communities of our business users access the data.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access