Continue in 2 seconds

Our client already has many data marts without a data warehouse.

By
  • Douglas Hackney, Sid Adelman, Les Barbusinski
Published
  • March 13 2002, 1:00am EST

Q:

Our client already has many data marts without a data warehouse. I have to build the enterprise data warehouse, but I don't have any idea how can I manage the existing data marts. Can I use the data marts for the enterprise data warehouse? How can I use the data marts to support and enterprise data warehouse?

A:

Sid Adelman’s Answer: Consider a federated approach that will give you the capabilities of an enterprise data warehouse while maintaining most of what is in place with your data marts. Search on “Federated Data Warehouse” and on “Colin White.” He has written extensively on this subject. Les Barbusinski’s Answer: In general, data warehouses support data marts…not the other way around. Data marts are usually created with a built-in organizational or functional bias (i.e., they are designed to generate a particular set of metrics from a specific business perspective). A data warehouse, on the other hand, is usually designed with an enterprise perspective. Using independent – or even federated – data marts to populate a data warehouse risks slanting the warehouse’s view of the enterprise to that of a given department or business unit … thereby limiting its usefulness and almost certainly killing its acceptance by the rest of the enterprise. Don’t try to “manage” the data marts. Try, instead, to support them with the new data warehouse. Remember that a data mart almost always enjoys more support among business management and end users than does a data warehouse.

Study the dimensions and hierarchies in all of the data marts and try to design a standardized and flexible set of “master” tables (e.g., customers, products, vendors, organizations, accounts, geographic locations, etc.) that can easily feed all of the marts and thereby achieve some processing economies. Similarly, study the fact tables in the data marts and design a standardized and flexible set of transactional tables (orders, trades, deposits, shipments, clickstream data, etc.) that can feed all of the data marts. Next, explore ways that the data warehouse can store historical and/or “snapshot” data – as well as expensive and/or difficult to obtain external data – that could be supplied to the marts on demand. Marts rarely save more data than they need in the short term and reducing the lead time it takes to populate a mart with a new dimension or fact can win acceptance for the data warehouse in a hurry. In short, try to make the data warehouse the preferred source of cleansed and integrated information for the data marts.

Next, extend the data warehouse to provide innovative services that the data marts can “subscribe” to such as a) populating facts and dimensions from the data warehouse through Oracle materialized views, b) providing real-time data feeds for selected data elements via EAI message broker technology, or c) providing LDAP-based enterprise meta data that is easily accessible by the data marts’ Web- based OLAP tools, etc.

Doug Hackney’s Answer: You may be able to salvage some things of value from the data marts. If they are incremental architected data marts, then you can simply consolidate them into the EDW. If they were built in a non-architected manner, after you utter curses upon the teams that built them, you will be left with trying to get what you can from their efforts.

Look to get what value you can from:

  • Business requirements analysis work
  • Source system analysis work
  • Standard business rule ETL modules
  • End-user access and analysis environments

You will be left with the challenging job of integrating non-architected data marts into the overall EDW system architecture. I cover this task in my book, Understanding and Implementing Successful Data Marts. The full range of required activities is beyond the scope of this forum.

Scott Howard’s Answer: It's difficult to have your independent data marts feed the enterprise data warehouse, much like having the horse push the cart. This architecture can lead to inconsistencies between data marts that is only discovered when populating the enterprise data warehouse (EDW). You've got to turn your model around and strive for the EDW feeding your data marts, not have the data marts support the EDW as you propose.

The EDW is the single consolidation point at which you materialize the single truth or consolidated view of your enterprise. This in turn will feed your data marts ensuring that only conformed and consolidated facts and dimensions make it to your data marts. Without this consolidation point your data marts will eventually drift apart and soon contradict each other, if they don't already. It's easy to get your business units to agree on standard business measurements such as sales volumes and revenue, but the dimensional data such as a standard definition of geography, the customer, or even of time itself can be illusive. Imagine the confusion that results from calculating a standard weekly sales by region across two data marts where the two data marts have conflicting definitions of geography and of time.

How can that happen? One data mart defines the week beginning Monday while the other begins on Saturday. One data mart defines region 1 as the Northern U.S. while the other defines region 1 as the Eastern U.S. This is not only possible, but likely when you build independent data marts for separate operational units. Your task is to solve all these anomalies resulting from inconsistent meta data and dump the results into an EDW. Now once you accomplish that, it's important to establish the real source for all data marts as that now consolidated EDW. That's the only way to ensure consistency for all future data marts and consistent expansion to the current. You'll also have to rearchitect your ETL flows to populate the EDW directly from the operational sources. Your data marts will now have to look toward the EDW as their consolidated source. It's a lot of work, but the benefit is the permanent consistent solution that you will construct for your business. The EDW should also contain the detailed data to support drill down, not represented in the normally aggregated data mart. This is almost impossible to materialize given the data mart feeding the EDW proposal.

Chuck Kelley’s Answer: You can use the data marts as the enterprise data warehouse IF AND ONLY IF, the dimensions within each of the data marts are the same (or extremely similar – meaning that one may be more granular than another, but must contain exactly the same type of data). The IF in this case if pretty hard to do. If there are not strong controls over the dimensions during the building of the data marts, all your client has done is to build stovepipe or silo data marts, which is what the data warehouse was supposed to have solved for them. I think that what you need to do is the start building the data warehouse for them as if they have done nothing. The good thing is that you have an idea of the types of questions that are being asked (from the current data marts).

David Marco’s Answer: The data marts are cannot be used as the enterprise data warehouse. I wrote a two-part article on this subject in DM Review called "Independent Data Marts: Being Stranded on Islands of Data (Part 1)” and "Independent" Data Marts: Being Stranded on Islands of Data (Part 2).” Go to www.EWSolutions.com and click on the “Research Center.”

Clay Rehm’s Answer: My first thought is to hire an experienced data warehouse consultant to assess the specific situation and provide alternatives. Even with the advice you receive in this column, it is difficult to know all the problems you will face when building the enterprise data warehouse.

You did not state why you need to build the enterprise data warehouse, so I assume that there is a good reason and you are pursuing to solve a specific problem. If this is not the case, please stop what you are doing and create a project charter and scope statement that identifies the problem and how your enterprise DW will solve it.

You may need to build your data warehouse using the federated approach, which means your databases will be distributed across the enterprise but will look like they are in one place from the users point of view. One challenge will be creating an enterprise meta data repository and getting all of the project stakeholders across the organization to agree on semantics and definitions of terms.

In the meantime, be proactive and interview each and every stakeholder. Document each of their requirements and objectives, and share these with your team.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access