Q:

How do you handle the dimension key in a fact table when a row in the fact table does not relate to one of the dimension tables? An example is dimension for group/prof org related to a members account. If the member belongs to a group, then the surrogate key of the group table is used in the fact table. If a member does not belong to any group, what should be used in the fact table. Which is more common – to create a "dummy" surrogate key which would state "no group" or use nulls in the fact table key for group?

A:

Les Barbusinski’s Answer: This is a common dilemma, and one usually overlooked by DW architects. There are actually two cases where the need for a "dummy" dimension key arises: 1) the fact row has no relationship to the dimension (as in your example), and 2) the dimension key cannot be derived from the source system data. Both of these cases are typically handled by creating a "dummy" row on the dimension table. The key for such a "dummy" row is set to an out-of-bounds value (such as a negative number like –1). All other attributes on the "dummy" row are set to nulls or default values (e.g., "?" for text fields, zero for numeric fields, 01/01/0001 for date fields, etc.). The negative key and default values provide visual cues on reports … and users quickly learn to translate such values to "unknown" or "not applicable."

Using nulls in the fact row’s dimensional foreign key is not a good idea. It can lead to spurious results in calculations and/or loss of data in queries (unless you explicitly specify outer joins in affected queries). Hope this helps.

Clay Rehm’s Answer: I like to have every value accounted for, so I create dimension values (rows) of "unknown" or n/a or something similar. This way, users do not have to worry about outer joins or about "null" values.

Scott Howard’s Answer: I do not recommend using a null because that can create problems for some BI outer join queries and in the case when a composite of the dimension keys is used in temporal queries or as the primary key (can’t contain a null). I recommend creating a valid dimension entry, in your case your "no group" value. This is a legitimate value so we can debate whether that constitutes a dummy key, but that really doesn’t matter.

Also consider that there may be more subtle meanings of "no group" not yet considered and not yet used. Legitimate meanings like "group not permitted" or "group not applicable" could also be possible and are very different than no group. Using a null to embrace all of the known and still yet unknown meanings could limit your solution.

This concept of creating legitimate keys with real business meaning can be applied to almost any dimension solving your problem permanently while avoiding the pitfalls associated with nulls.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access