MAY 11, 2006 1:00am ET

Related Links

Which CDC method is the best to achieve staging database with changed data?
March 7, 2008
Apart from bloated dimension, what are the negatives of using all known attributes in your SCD?
March 7, 2008
When is it better to have normalized data to create data marts and when is it better to have dimensional data?
March 7, 2008

Web Seminars

Data Replication for Real-time (Big) Data Warehousing
Available On Demand
Improving your Overall Analytical Environment by Migrating to a New Data Warehouse Platform
Available On Demand
The Dynamic Duo of Data Warehousing and Real-Time Streams
Available On Demand

Can we have data marts which have the same structure but at different locations?

Print
Reprints
Email

Q: Why can't we have data marts which have the same structure but at different locations? For example, consider a telecom provider which operated from multiple countries. Can't we build three data marts with same structure in the three countries? What will these data marts be called?

Joe Oates' Answer:

For a geographically dispersed BI capability, it is normal to have data marts located in different countries rather than pay for high-speed lines from those countries to the headquarters. It certainly helps to have the same structures for dimensions and facts in all locations. However, there is more to it and then merely having the same structures.

The headquarters location will surely wish to provide analyses and reports based on all of the "satellite" data marts and consolidate them into some sort of central data warehouse. One of the issues that will arise is that different countries will have different products or versions of the same product stored in dimension tables. The primary key values of rows for corresponding products will more than likely be different in each country's data mart.

So, even if all that is to be done is to store aggregate fact tables at the center location, it will still require ETL to bring data in from each of the data marts and store it in the central data warehouse. In effect, you'll need to treat each of the data marts as "legacy systems." Other issues include common products having different names due to local language, local sales currencies these that have to be translated into the headquarters accounting consolidation currencies, etc.

Adrienne Tannenbaum's Answer:

Same structure, different data populated is a simple distributed database architecture. Do they have any common values (such as country code)? Things to be dealt with include synchronization and access/security - should each location also have access to the other location's data - same types of access ... Will they all be sourced identically?

Usually one of the data marts is the master/parent/primary part of the overall distributed architecture and serves as the controlling start point in the overall data distribution scenario.

Tom Haughey's Answer:

There is no reason that you cannot have identical structures in multiple locations. Many warehouses do. As a case in point, the financial DW for a large computer manufacturer does exactly this. Figure 1 is an example of it.

Figure 1

The four regional data marts are identical in structure and in software. The only difference is the values each stores. The four identical data marts are, say, North America, South American, Europe-Middle East-Africa, and Americas Far East. In this case, each regional data mart contains only the financial transactions for that region. Each region in turn feeds data to the enterprise warehouse. From the enterprise warehouse, you can run enterprise queries or regional queries. From the regions, you can only run queries for that one region. Another case in point, a consumer products company has 12 business units. Each BU has an identical database, but only contains data for that BU.

Name them what you want. Perhaps, append the region name to the mart in each case if you cannot use the same name.

Joe Oates is an internationally known speaker, author and consultant on data warehousing. Oates has more than 30 years of experience in the successful management and technical development of business, real-time and data warehouse applications for industry and government clients. He has designed or helped design and implement more than 30 successful data warehouse projects.

Adrienne Tannenbaum is president of Database Design Solutions, Inc. (www.dbdsolutions.com), a New Jersey-based consulting firm specializing in the revitalization of corporate data. The firm focuses on data issues within large organizations and supports all data reconstruction efforts with a solid meta data backbone. Tannenbaum is the author of two popular meta data-focused books: Metadata Solutions: Using Metamodels, Repositories, XML, and Enterprise Portals to Generate Information on Demand (2001, Addison Wesley) and Implementing a Corporate Repository (1994, Wiley).

Tom Haughey is the president of InfoModel LLC, a training and consulting company specializing in data warehousing and data management. He has worked on dozens of database and data warehouse projects for more than two decades. Haughey was former CTO for Pepsi Bottling Group and director of enterprise data warehousing for PepsiCo. He may be reached at (201) 337-9094 or via e-mail at tom.haughey@InfoModelUSA.com.

Filed under:

Advertisement

Comments (0)

Be the first to comment on this post using the section below.

Add Your Comments:
You must be registered to post a comment.
Not Registered?
You must be registered to post a comment. Click here to register.
Already registered? Log in here
Please note you must now log in with your email address and password.
Twitter
Facebook
LinkedIn
Login  |  My Account  |  White Papers  |  Web Seminars  |  Events |  Newsletters |  eBooks
FOLLOW US
Please note you must now log in with your email address and password.