Continue in 2 seconds

We are using a bottom-up approach in our project.

By
  • Chuck Kelley, Scott Howard
Published
  • November 05 2003, 1:00am EST

Q:

We are using a bottom-up approach in our project. My question has two parts:

a) How do I integrate two data marts that are in different SQL Servers at different locations for the same company, having the same business context but little variation in data mart structures, and what should be the data mart update/load strategy to follow?

b) Is it appropriate to load data from one data mart to another and make a single data mart having both locations’ data in one? If it’s appropriate, how do I transfer the data from one to the other? Can we use Data Transformation Services DTS provided by SQL Server or is there any other method? (The data marts have some common factors in their structures. How do I do it and what is the incremental strategy for each?)

A:

Scott Howard’s Answer: Is the horse pulling the cart or is the cart pulling the horse? I’ve recently heard a well-known respected consultant and pundit advocate combining several data marts into an enterprise data warehouse sourced from the existing data marts. I strongly disagree. Allow me to share with you something that I wrote for a reader last year. It’s very applicable to your question. Realize that your data structures may now be similar, but will they remain so? Is the meta data truly consistent? Read on and discover the flaws of independently updating and loading the data marts.

(DM Review, February 2002)
It's difficult to have your independent data marts feed the enterprise data warehouse (EDW), much like having the horse push the cart. This architecture can lead to inconsistencies between data marts that is only discovered when populating the EDW. You've got to turn your model around and strive for the enterprise data warehouse (EDW) feeding your data marts, not have the data marts support the EDW as you propose.

The EDW is the single consolidation point at which you materialize the single truth or consolidated view of your enterprise. This, in turn, will feed your data marts ensuring that only conformed and consolidated facts and dimensions make it to your data marts. Without this consolidation point your data marts will eventually drift apart and soon contradict each other, if they don't already. It's easy to get your business units to agree on standard business measurements such as sales volumes and revenue, but the dimensional data such as a standard definition of geography, customer or even time itself can be illusive. Imagine the confusion that results from calculating a standard weekly-sales-by-region across two data marts where the two data marts have conflicting definitions of geography and of time.

How can that happen? One data mart defines the week beginning Monday while the other begins on Saturday. One data mart defines region 1 as the Northern U.S. while the other defines region 1 as the Eastern U.S. This is not only possible but likely when you build independent data marts for separate operational units. Your task is to solve all these anomalies resulting from inconsistent meta data and dump the results into an EDW. Now once you accomplish it – the real source for all data marts is a consolidated EDW. That's the only way to ensure consistency for all future data marts and consistent expansion to the current. You'll also have to rearchitect your ETL flows to populate the EDW directly from the operational sources. Your data marts will now have to look toward the EDW as their consolidated source. It's a lot of work, but the benefit is the permanent consistent solution that you will construct for your business. The EDW should also contain the detailed data to support drill down, not represented in the normally aggregated data mart. This is almost impossible to materialize given the data mart feeding the EDW proposal.

Please also consider my April 2002 comments regarding MS SQL Server’s DTS. (DM Review, April 2002)
I don't think DTS was ever intended to be a full-featured ETL tool. You have accurately expressed its limitations thus also the reason for all the many successful ETL vendors that thrive today. Take a good look at those other ETL tools with your DTS shortcomings in mind as those are the major features important to you. Once a vendor starts to wow you with their many ancillary features, you may lose sight of why you need an ETL tool in the first place.

Chuck Kelley’s Answer: Congratulations. You have hit the problem with data mart consolidation on the head. I would change my transformations to build a data warehouse (it can be a dimensional model as well) and then feed the data marts. Having the data marts feed a consolidated data mart (data warehouse) seems backwards to me. In either case, DTS should work find, unless there are some really complex rules.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access