Question: I am writing a thesis on data warehousing. I would like to know how to decide when to implement Kimball or Inmon's approach. When is it better to have normalized data to create data marts and when is it better to have dimensional data? Are there some criteria or guidelines?


Chuck Kelley’s Answer: Hoping not to be bombarded by emails, I still don’t get the difference between Kimball’s and Inmon’s approaches. Kimball looks at the data from the perspective of how the business uses it and hence concentrates on data marts and star schemas, which is the right thing to do.


Inmon, on the other hand, looks at it from a data management perspective. That is, how you manage the data, which is the right thing to do.


From a high level, Kimball and Inmon say the same thing. Extract data from your sources (and external data) and put them in a big blob (Kimball calls it a staging area and Inmon calls it a data warehouse). Next, extract the data required to solve business issues and load them into a data mart. Yes, I know that there are some little nuances that separate them, but on the whole, I think they are the same, except for the perspective.
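The two-step flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: the source rows, the `landing` table (standing in for the staging area or data warehouse), and the `mart_sales_by_customer` table are all hypothetical names, and an in-memory SQLite database stands in for a real platform.

```python
import sqlite3

# Hypothetical source rows: (order_id, customer, product, amount).
SOURCE_ROWS = [
    (1, "Acme", "widget", 10.0),
    (2, "Acme", "gadget", 25.0),
    (3, "Globex", "widget", 10.0),
]


def build_mart(rows):
    """Land source rows in one place, then derive a small data mart."""
    con = sqlite3.connect(":memory:")

    # Step 1: extract from the sources and land everything in one store
    # (the "big blob": staging area or data warehouse, depending on author).
    con.execute(
        "CREATE TABLE landing "
        "(order_id INTEGER, customer TEXT, product TEXT, amount REAL)"
    )
    con.executemany("INSERT INTO landing VALUES (?, ?, ?, ?)", rows)

    # Step 2: extract only what one business question needs into a mart.
    con.execute(
        "CREATE TABLE mart_sales_by_customer AS "
        "SELECT customer, SUM(amount) AS total_amount "
        "FROM landing GROUP BY customer"
    )

    result = con.execute(
        "SELECT customer, total_amount "
        "FROM mart_sales_by_customer ORDER BY customer"
    ).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    print(build_mart(SOURCE_ROWS))
```

Whether the landing store is kept normalized and managed as an asset in its own right, or treated as a pass-through on the way to the marts, is where the difference in perspective shows up; the sketch itself takes no side.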


Joe Oates’ Answer: First of all, in the current Inmon approach, the enterprise data warehouse (EDW) is normalized, and denormalized or dimensional data marts are built off of the EDW for ease of querying. Also, there is a lot of confusion over what a “dimensional” model is.


Initially, dimensional designs were described as being strictly “star schemas,” with all of the dimension tables as flat tables and no many-to-many relationships. As dimensional modeling has matured, permissible snowflaking (a normalization technique) has become part of it, as have “helper” or “bridge” tables, which are associative or junction tables that resolve a many-to-many relationship. An example of permissible snowflaking is the “heterogeneous product” problem. An example of a helper table is multiple insured drivers associated with a single insurance policy. Both of these are covered in The Data Warehouse Toolkit, Second Edition, by Ralph Kimball et al. The former example is found on page 213 and the latter on page 318.
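To make the helper-table idea concrete, here is a minimal sketch of the insured-drivers case. The table and column names (`fact_policy`, `dim_driver`, `bridge_policy_driver`) and the sample rows are hypothetical, chosen for illustration rather than taken from the book, and SQLite is used only because it runs anywhere.

```python
import sqlite3


def drivers_per_policy():
    """Resolve a many-to-many policy/driver relationship via a bridge table."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE dim_driver (driver_key INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE fact_policy (policy_key INTEGER PRIMARY KEY, premium REAL);

        -- Bridge (helper) table: one row per (policy, driver) pair.
        CREATE TABLE bridge_policy_driver (
            policy_key INTEGER REFERENCES fact_policy(policy_key),
            driver_key INTEGER REFERENCES dim_driver(driver_key),
            PRIMARY KEY (policy_key, driver_key)
        );

        INSERT INTO dim_driver VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO fact_policy VALUES (100, 900.0), (200, 450.0);
        INSERT INTO bridge_policy_driver VALUES (100, 1), (100, 2), (200, 3);
        """
    )

    # Join the fact to the dimension through the bridge.
    rows = con.execute(
        """
        SELECT f.policy_key, d.name
        FROM fact_policy AS f
        JOIN bridge_policy_driver AS b ON b.policy_key = f.policy_key
        JOIN dim_driver AS d ON d.driver_key = b.driver_key
        ORDER BY f.policy_key, d.name
        """
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for policy_key, name in drivers_per_policy():
        print(policy_key, name)
```

One caution with this shape: summing `premium` after joining through the bridge counts a policy once per driver, so additive measures need to be aggregated before the join or allocated across the bridge rows.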


So, the bottom line is that, while there are still real differences, the Inmon and Kimball approaches have grown closer over time.


Having successfully built data warehouses using both approaches, I personally have found that a combination of the two works well for enterprise data warehouses where there is value in things like supertypes and subtypes. On the other hand, the dimensional approach is almost always adequate for data marts.

Larissa Moss’s Answer: Both of Steve Hoberman’s books have great guidelines for data modeling: Data Modeler’s Workbench (ISBN 0-471-11175-9) and Data Modeling Made Simple (ISBN 0-9771400-0-8). You can also sign up for his Design Challenges.