I have to model a transportation data mart. There are 3 levels of events, one is "Itinerary" and there are facts associated with that top level. The next level is "Routes" and that has different facts associated with it, but those facts roll-up to "Itinerary" and the last level is "Trip" which again has individual facts that rollup to both "Routes" and "Itineraries".

These facts are a mixture of numeric and datetime fields (I have to model it down to the second, so I can't use a date dimension - or can I? An added complexity is that I have an employee/role dimension that I don't know how to model. Each employee can have a role for a particular Itinerary and an associated time allowance, the rest of the facts are associated with each role equally. So the 3 questions are: How do I model the 3 levels of facts, how do I model the Time dimension, and how do I model the employee/role dimension? Thanks.


Adrienne Tannebaum's Answer: Take a few steps back and stop worrying about dimensions until you figure out the actual relationships between your "entities" or "subjects" or "potential dimensions". For example, you are questioning the way to model a "role" - this is clearly a logical modeling step. Can an employee have many itineraries? Can one itinerary apply to many employees? The role in this case breaks up a m any-to-many relationship. The same type of analysis needs to be done with "route" and its direct relationships with employee and itinerary. Once you do this type of work, your dimensional design will be much easier and much more flexible.

Joe Oates' Answer: Without knowing exactly what kind of business you are referring to (e.g., trucking, airline, bus, etc.) and not knowing exactly what is in the itinerary, routes and trip tables, I will have to make some assumptions.

I assume that an itinerary is a document that covers the entire trip listed point by point with departure times and arrival times and perhaps some other information. I assume that routes documents leaving one point and arriving at another. I assume that trip is equivalent to a single a segment or leg of the entire trip that documents the departure and arrival time and place, the distance in miles and some other measures like trip duration, number of minutes late, etc.

Though it may not seem logical, you should treat the Itinerary and routes as dimensions. The fact table is the trip table. Your date dimension should only go to the day level. Include a separate dimension and call it time of day, hour or whatever sounds appropriate to you.

As far as the employee roles are concerned, create three different database views of the employee dimension and use each of them as if they were separate dimensions. Each of these should have a primary key that is named differently from the primary key for employee. So instead of using Employee_ID as the key for all three views, name the primary key for the role the employee plays, e.g., Driver_ID, Consigner_ID, etc.

Ralph Kimball's book The Data Warehouse Toolkit (Second Edition) has a couple of examples that will help you. The figures on pages 231, 233 and 235 should give you the idea.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access