As Low as You Can Go
A modeler on your team is building a consumer complaint dimensional model. The business requirement is to allow users to view the number of complaints at different levels of detail, the lowest level required being Region, Month and Brand. The modeler, however, is unsure whether more detail might be required in the future and has built the model at the lowest level which is Consumer, Day and Product (see Figure 1). An obvious advantage to this model is that it can answer more detailed questions. What are the other advantages to a dimensional model being at its lowest level? What are the drawbacks?
Taking it to the Next Level
The main advantage to adding more detail is that the users will be able to answer more specific questions. Database Engineer Steve Turnock summarizes: "User questions are unpredictable, and if you apply business rules to summarize the fact table, you have lost the ability to answer new user questions." For example, after Mary learns that there are 50 complaints on the CUSoon Sunglass Brand in December 2008 from the northwest, she might drill in to more details to learn which product within this brand is generating the most complaints, or which day within December 2008, or which city within the northwest.
Other advantages of providing the lowest level are:
- Greater consistency with other applications. Data Architect Mark Taylor explains this advantage: "The dimension tables will be more reusable with other fact tables. The three dimensions in the example are all highly likely to be conforming dimensions within a wider data model, so modeling them at the lowest level now will save amending their design later."
- Faster load time. Data Architect Bob Mosscrop III mentions that the data load time could be faster because the loading process is doing more extract and load, and less transforming. That is, there is little or no processing time spent summarizing and sorting.
- Greater query freedom. Rakesh Nandish, information architect, and Don McMunn, consultant, both consider the ability for users to play in the detailed data to be a large advantage. Don says it "supports better coverage for information mining and ad hoc data analysis."
- Fresher data. Data Modeler Jan Cohen thinks more granular data usually means fresher data because of daily updates.
- Easier restatements. Senior Technical Architect Sai Koduri emphasizes a very big advantage is having the ability to restate and recreate history if business rules change, because we will be storing all of the atomic building blocks.
However, just like there is really no such thing as a free lunch, there are also drawbacks to providing the lowest level:
- More data quality issues. The more data we bring in, the greater the chance becomes we'll get bad data. Phrasing this in a more positive tone, Don says, "More quality assurance steps and processing will be required to keep the additional detail synchronized."
- More complexity. A dimensional model should be user-friendly, and with more detail comes more complexity in design.
- Retrieval performance concerns. More data could lead to more time spent retrieving data. Data Architect Mark Taylor says, "The lowest level of detail demands the highest level of storage space. Additionally, assuming we are not providing any summary tables, all summarized reporting will require on-the-fly aggregation calculations, which will be more processor intensive, and therefore, query performance could be an issue."
- Political hurdles. Solutions Architect Murali Vishnuvajhala comments, "At times it can be challenging to present to the IT manager why I need more resources when the current business questions can be answered with less resources."
Senior Database Architect Wade Baskin recommends you consider the reasonable level of granularity that reflects your business situation. "In retail for example, does it make sense to track the transaction down to the individual product, checkout lane and price point, when all your business needs is the amount of a particular product sold, at a store, for a particular day?"