What is a degenerate dimension and how is it used?


Chuck Kelley’s Answer: A degenerate dimension is data that is dimensional in nature but stored in a fact table. For example, if you have a dimension that only has Order Number and Order Line Number, you would have a 1:1 relationship with the Fact table. Do you want to have two tables with a billion rows or one table with a billion rows. Therefore, this would be a degenerate dimension and Order Number and Order Line Number would be stored in the Fact table.

Joe Oates’ Answer: Many data warehouse transaction fact tables have a control number, such as an invoice number, purchase order number or policy number. Over the course of years reflected in the data warehouse, there could be hundreds of thousands or even millions of each particular control number such as invoice number. It isn’t practical to have a drop-down list of each of these numbers to find data typically found in the legacy system invoice header such as invoice number, invoice date, invoice type. Consequently, these data items are put into the line-item fact table to slice and dice as the users wish. These control numbers are called degenerate dimensions.

If you were to have a dimension table for invoice, you would have nearly as many entries in the dimension table as you have in the line-item fact table. The line-item fact table is generally the largest table by far in the data warehouse. So joining the multimillion or multibillion row fact table to a multimillion or multibillion row dimension table will cause your data warehouse to take up much more disk storage that it should as well as significantly degrading performance

Clay Rehm’s Answer: A degenerate dimension is a dimension that is stored in the fact table rather than the dimension table. It eliminates the need to join to a Dimension table.