Exploit Your Fact Tables
Clean fact table designs have application in the front room and back room
Information Management Magazine, Jan/Feb 2009
My previous column described the three types of fact tables that are all you will ever need in your data warehouse (see December 2008 DM Review). The secret to this simple observation is adhering fanatically to the grain. A fact table records measurement events, and as long as we only record one kind of measurement event in a given fact table, we only need the three basic types: transaction grain, periodic snapshot grain and accumulating snapshot grain. This month, I'll describe basic ways to exploit these clean fact table designs in the front room and in the back room.
Front-Room Aggregate Navigation
Advertisement
Front-Room Drilling Across Different Grains
If you are good at visualizing dimensional schemas, you will understand that you can drill across multiple fact tables at different grains as long as you choose conformed dimension attributes for the answer set row headers that exist for all the fact tables in your integrated query. For example, if you have a sales fact table in the grocery store whose grain is every individual transaction, and you also have a monthly brand forecast table whose grain is product brand by month, you can perform a drill-across query on these two tables as long as all the dimensional references in the SELECT list and in the query constraints refer only to brands and months. The process of sort-merging the answer sets coming back from these two fact tables was described in my recent column on the architecture of integration: "Fact Tables."
Front-Room Exporting Constraints to a Different Business Process
If you have a sophisticated data warehouse, you may be able to support the capabilities in the previous two sections. But there's one absolute bedrock of decision support I bet you can't do. Suppose you have drilled down in the grocery store database and found a product brand that seems to be selling poorly in certain regions. Your instinct is aroused, and you ask, "What were the merchandising deals we did with that manufacturer in those regions and in the time period I am concerned with?" In most data warehouses, you would have to scribble down the brand, region and time period information on a piece of paper, close the current application, open another user interface and reenter this information. But why can't you simply select one or more offending rows in the first application and copy/paste them with a single user interface gesture into the second application? The context of the selection would apply the appropriate constraints directly. I know this is possible because I have built query tools that do this.
Back-Room Fact Table Surrogate Keys
Surrogate keys are a staple of dimension table design, but surprisingly, there are times when we want a surrogate key in a fact table. Remember that a surrogate key is just a simple integer key, assigned in sequence as records are created. Although fact table surrogate keys (FSKs) have no business meaning in the front room, they have a number of uses in the back room:
- FSKs uniquely and immediately identify single fact records.
- Because FSKs are assigned sequentially, a load job inserting new records will have FSKs in a contiguous range.
- An FSK allows updates to be replaced by insert-deletes.
- Finally, an FSK can become a foreign key in a fact table at a lower grain.
Ralph Kimball is the founder of the Kimball Group and Kimball University where he has taught data warehouse design to more than 10,000 students. He is known for the best selling series of data warehouse "Toolkit" books. He started with a Ph.D. in man-machine systems from Stanford in 1973 and has spent the last 34 years designing systems for end users that are simple and fast. You can reach him at ralph@kimballgroup.com.
For more information on related topics, visit the following channels:






