If there are 10 data marts which use different schemas (some use star and the others use snowflake), will inconsistencies arise designing a data warehouse using a bottom up approach?


Les Barbusinsk’s Answer: Having different types of schemas in your data marts doesn’t – by itself – introduce inconsistencies. How you define and structure your data can, however. For example, including Product Class as an attribute in a consolidated Product dimension in one data mart, while breaking it off into a separate "snowflake" in another really makes no difference in terms of data consistency. However, if one mart defines a Product Class as having values "A" through "Z," while another defines Product Class as having values "A," "C," "F" and "Other" you may have problems equating Product-related metrics generated by the two marts.

Hope this helps.

Clay Rehm’s Answer: If the 10 data marts have been modeled without the knowledge of each other, then they most likely will be inconsistent with each other. The key to the bottom-up approach is making sure you have the enterprise in mind when constructing each data mart. This is a very difficult thing to do! Most people don’t even try this method because of the extra effort it takes to get coordination and cooperation across many business lines.

For example, the term "conformed" dimension means you actually took the time and energy to make sure the dimensions you created can and will be used by other data marts across the enterprise. The "customer" dimension should really only exist once and be shared among the data marts.

The key to remember is the same for almost every project – communication. Don’t wait for it to happen. Be an instigator! Communicate your project goals to anyone who will listen! Please don’t design another data mart that will become a data island.

Successful data warehouse/data mart implementations happen because the project team members have solid technical AND communication skills.

Joe Oates’ Answer: From the wording of your question, I am not sure whether the 10 data marts already exist or whether the 10 data marts will be built. The generally accepted definition of bottom-up data warehouse development is that a company starts building independent data marts and then somehow, someday, magically an enterprise data warehouse will suddenly appear. In either case, inconsistencies will cause a considerable problem when some unfortunate team tries to pull them together to create a central data warehouse.

The key problem is that it is difficult, if not impossible to identify "the truth." Because of the bottom-up approach, each stovepipe data mart will have its own version of the truth. Which version of the truth is correct? There will be data redundancy, some of which may go unnoticed due to different levels of granularity among the data marts. There is also the issue of synchronization, e.g., different data marts may be updated on different schedules, and so the concept of month or other time period may be different among the 10 data marts. Also, different data marts will have different data designs for common business concepts such as customers, products, etc. And then there are the hundreds of ETL programs that populate 10 data marts. Someone has to understand these.

The concept of implementing data marts progressively among the various business units is very attractive because the thorny and time-consuming problem of "harmonizing" design issues among the various business units or even between different companies in a conglomerate can be ignored. However, if there is to be a central data warehouse, a single version of the truth, these thorny and time-consuming problems will have to be tackled eventually. The more data marts that have been developed, the harder integrating them into a central data warehouse will be.

Chuck Kelley’s Answer: Just because the data marts are stars or snowflakes will not cause inconsistencies, but other issues can. For example, some of the data can be 1) summarized or calculated differently or 2) extracted on different days or times. All of these could hinder your ability to combine the different data marts into a data warehouse.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access