Q:  

I have a question on best practices to be used in designing a data warehouse. Assuming the data warehouse requires both an ODS to integrate legacy systems and a DDS for analysis, where should we timestamp the data, in the ODS or the DDS. DDS granularity is at month level. Is it OK to timestamp (retire records in ODS using begin date and end date) in ODS and in DDS only give a monthly view of the records, e.g., Product dimension data attribute varies by day, but management just wants end-of-month attribute for analysis. Is it OK, if I use an ODS to store daily movements in product attribute and DDS to show the month-end level status or should I timestamp in DDS as well on a daily basis. I'm looking for a best practice approach.

A:  

Clay Rehm's Answer: No matter what the acronym is on the database or data store, a timestamp should be on stored and populated on every single record that is either inserted or updated. An ODS is the place to keep your detail in a more normalized fashion. This way, you can summarize it to any level of detail.

Les Barbusinski's Answer: The ODS and DDS serve different purposes and require different structures and content. The ODS contains current and historical snapshots of dimensions as well as "atomic" transactional information and, hence, needs to have all rows stamped with the actual transaction or snapshot date/time. The DDS, on the other hand, aggregates data at various levels in order to highlight patterns, and reveal trends. If the lowest level of "granularity" in the DDS is monthly, then the Time dimension need only include year, quarter and month ... and all relevant facts need only be tied to the month dimension. In this particular case, detailed timestamps in the DDS would be irrelevant ... adding cost with no discernable business benefit. Hope this helps.

Chuck Kelley's Answer: From your description, I would probably build an ODS with timestamps, build the data warehouse (DW) from the ODS at the daily level and then feed the DDS each month from the DW.

Joe Oates' Answer: In my opinion, the enterprise level data warehouse eventually provides the most value because all information is integrated in one place. However, building an enterprise data warehouse from scratch is a daunting task and is usually much more expensive than the original estimate.

On the other hand, once several individual data marts have been built, it is most often impractical to try to integrate them into a consolidated enterprise data warehouse without some careful planning and design. At a minimum, the most important dimensions that are anticipated to be used across all or most of the data marts must be designed to be common. Ralph Kimball calls this concept "conformed dimensions."

The more of these preplanned, common dimensions there are, the easier it will be to consolidate the data marts. Unfortunately, designing common dimensions across several business areas is not an easy job, particularly if your company has many heterogeneous products and several versions of the organization hierarchy.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access