Bottom-Up Warehouse Development

Published
  • February 01 1998, 1:00am EST

First there was the Easter bunny. Then the tooth fairy. Then there was the emperor who wore no clothes. Now we have bottom-up warehouse development, where the enterprise data warehouse is developed by first building a data mart.

The theory is that in bottom-up data warehouse development first one data mart is developed, then another data mart is developed, and then one day, presto, you magically and effortlessly wake up and have a data warehouse. While such a scenario appeals to the fantasy of those people who crave a quick fix without having to bother with the complexity, reality and investment that a real data warehouse requires, the truth is that bottom-up warehouse development is just a lot of wishful thinking. For a number of very good reasons, there is no substance behind bottom-up warehouse development.

To understand why bottom-up warehouse development is a fantasy, consider what happens when a corporation enters the data warehouse experience by attempting to build a data mart first. The corporation is undoubtedly happy when data mart number one is built. There is nothing to suggest that anything untoward is afoot at this point. The corporation quickly and eagerly builds data mart number two. There is general happiness all around until someone looks beyond the immediate boundaries of a given data mart and notices that there is significant overlap (a large redundancy of data) between the two data marts.

Soon data mart number three is on its way. For a brief moment when data mart number three appears, there is harmony. However, someone soon notices that all three data marts contain massive amounts of redundant data. The cost of each data mart repeating the capture and management of the same data is intolerable.

But massive redundancy of data among data marts is not the only problem. When management asks for an analysis to be done by each of the departments having a data mart, there is no consensus. One department says revenues have increased by 13 percent. Another department says that revenues are flat. And yet another department says that revenues fell by 8.5 percent. No one has a believable answer and, furthermore, no one knows how to get a believable answer. Management is perplexed. How do you make a decision when marketing is saying one thing, sales is saying something quite different and finance has yet another answer? At this point, the fallacy of building a bunch of data marts begins to show.

But soon another data mart is built. About this time, it is noticed that the first department to build a data mart has had to write 40 interface programs in order to get data from the legacy environment. Department two has had to write 55 separate and unique programs that move data from the legacy environment to its data mart. Department three has 38 programs that have been custom written in order to populate its data mart, and so forth. A small army of programmers is required to support the individual and customized interfaces required for the movement of data into each data mart. Every time a new data mart is built, the army grows by a few more soldiers. Not only are a large number of programmers kept busy but, at some point in time, the legacy application simply cannot support any more programs accessing data from the application. Furthermore, each interface program has its own unique interpretation of data so that there is no consistency whatsoever in the data that resides in the different data marts.
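To make the consistency problem concrete, here is a minimal sketch of three departmental interface programs reading the same legacy records but each applying its own rule for what counts as revenue. Everything in it is hypothetical: the `Order` fields, the three rules and the amounts are invented for illustration, with the amounts chosen only so that the output echoes the conflicting figures described above.

```python
from typing import Callable, NamedTuple


class Order(NamedTuple):
    year: int
    amount: float
    returned: bool       # the customer later sent the goods back
    intercompany: bool   # sold to another unit of the same corporation


# Invented records standing in for one shared legacy source.
LEGACY_ORDERS = [
    Order(1996, 200.0, False, False),
    Order(1997, 183.0, False, False),
    Order(1997, 17.0, False, True),
    Order(1997, 26.0, True, False),
]

# Each department's interface program embeds its own interpretation.
RULES: dict[str, Callable[[Order], bool]] = {
    "marketing": lambda o: True,                                   # every booking counts
    "sales": lambda o: not o.returned,                             # net of returns
    "finance": lambda o: not o.returned and not o.intercompany,    # external, net of returns
}


def revenue_change_pct(orders: list[Order], counts: Callable[[Order], bool]) -> float:
    """Percent change in revenue from 1996 to 1997 under one department's rule."""
    totals = {1996: 0.0, 1997: 0.0}
    for order in orders:
        if counts(order):
            totals[order.year] += order.amount
    return round(100.0 * (totals[1997] - totals[1996]) / totals[1996], 1)


RESULTS = {dept: revenue_change_pct(LEGACY_ORDERS, rule) for dept, rule in RULES.items()}
print(RESULTS)  # {'marketing': 13.0, 'sales': 0.0, 'finance': -8.5}
```

None of the three answers is wrong by its own definition. The disagreement comes entirely from the rules buried in the separate interface programs, which is why no one can say which number management should believe.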

By creating and trying to maintain multiple data marts, the corporation is kidding itself if it thinks it has a data warehouse. All it has is a bunch of stovepipe DSS applications.

Why is it that the approach of building an enterprise, atomic, historical data warehouse simply cannot be achieved in a bottom-up fashion? There are many reasons why. The first reason is that an enterprise data warehouse necessarily entails integration. The enterprise data warehouse becomes a single point of truth at a granular level. Data warehousing requires that the atomic, corporate data be integrated. As such, the data in the enterprise data warehouse truly represents the corporation, not a department or a subdivision of the corporation.

And that's the biggest problem. No data mart developer ever builds integrated corporate data. Instead, each data mart developer builds data unique to the department that owns the data mart. If there is any integration at all, the integration is all within the confines and definitions of the department, not the corporation.

But integration is not the only reason why data mart data does not equal enterprise data warehouse data. Enterprise data warehouse data is fundamentally different from data mart data. Data mart data is optimal for the purposes of a single department, not the corporation as a whole. Data mart data has:

* A different level of granularity than corporate data warehouse data,

* A different key structure than corporate data warehouse data,

* Different attributes than corporate data warehouse data,

* Different sequencing of data than corporate data warehouse data,

* Different levels of summarization than corporate data warehouse data,

* Different amounts of history than corporate data warehouse data, and so forth.

There are many, many fundamental differences between the structure and content of corporate enterprise data warehouse data and data mart data. But differences between data structures and content are not the only reasons why a data mart does not magically and mystically turn into a data warehouse.
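The granularity and summarization differences listed above can be sketched in a few lines. The example assumes a made-up set of atomic transactions and two made-up departmental summaries; none of the names or figures comes from a real system. It shows that each mart can be derived from atomic data, while a question that crosses the marts can only be answered from the atomic level.

```python
from collections import defaultdict

# Invented atomic transactions: (date, customer, product, amount)
ATOMIC_SALES = [
    ("1997-01-05", "acme", "widget", 100.0),
    ("1997-01-20", "acme", "gadget", 50.0),
    ("1997-01-22", "zenith", "widget", 70.0),
    ("1997-02-03", "zenith", "gadget", 30.0),
]


def summarize(rows, key_of):
    """Roll atomic rows up to whatever key a department cares about."""
    totals = defaultdict(float)
    for row in rows:
        totals[key_of(row)] += row[3]
    return dict(totals)


# A marketing-style mart keeps month-by-product totals.
marketing_mart = summarize(ATOMIC_SALES, lambda r: (r[0][:7], r[2]))
# A finance-style mart keeps month-by-customer totals.
finance_mart = summarize(ATOMIC_SALES, lambda r: (r[0][:7], r[1]))

# Neither mart records which customer bought which product, so the
# customer-by-product view has to come from the atomic rows themselves.
customer_product = summarize(ATOMIC_SALES, lambda r: (r[1], r[2]))

print(marketing_mart[("1997-01", "widget")])   # 170.0
print(finance_mart[("1997-01", "acme")])       # 150.0
print(customer_product[("acme", "widget")])    # 100.0
```

In this sketch the January figures of 170.0 and 150.0 are both correct summaries, yet no arithmetic on the two marts recovers the 100.0 that acme spent on widgets. Summarization discards detail that cannot be rebuilt from the bottom up.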

When a department builds a data mart, one of the appeals of the data mart is that the department can select its own hardware and software technology. After all, the department is paying for the data mart, so it can choose whatever technology it deems appropriate. If a department really were building a corporate data warehouse, it would have to select a very different set of hardware and software than it would if it were building its own data mart. For example, if you are building a data warehouse, you need to select hardware and software that is scalable. If you select anything less than fully scalable, industrial-strength hardware and software, you will find that, as your environment grows, at some point you will have to change technology vendors. And changing technology vendors in the middle of a project is something no one welcomes.

But scalability is only one issue. Since when did one department select hardware and software for the entire corporation? It is one thing for a department to select hardware and software that will fit into its budget and will satisfy its technological needs. But for a department to select hardware and software for its data mart and then mandate that that technology be used across the corporation is a fantasy. Corporate politics simply don't work that way. Yet that is what is implied by one department building a data mart and then trying to turn that data mart into a warehouse.

Another reason why it is a fantasy to think that a data mart can turn into a data warehouse is budget. Building a truly corporate structure requires a significant budget. The design of a corporate environment must take into account many considerations well beyond those of any given department. But departments traditionally spend money for their needs (not some other department's needs). A normal department strongly questions why a nickel must be spent for anything but the department's immediate needs. Who is kidding whom? Marketing is not about to make a significant long-term expenditure for sales or finance when there is no requirement for marketing to do so. Yet the notion that a data mart turns into a data warehouse is predicated on such foolish assumptions.

Why then is all of this talk about building data marts before you build the data warehouse even an issue? It is an issue because some data mart vendors try to convince people that they are building an enterprise data warehouse when they are really building a data mart. Some data mart vendors look upon the corporate enterprise data warehouse as an obstacle to be bypassed on the way to making a software sale. Put another way: Does your data mart vendor care about your DSS architecture?

You are naive if you answer yes without looking at your overall enterprise strategy. Some of the data mart vendors cannot wait for you to do things right and build the corporate data warehouse first. Instead, these data mart vendors take all of the notions of architecture and jettison those concepts in the interest of a fast sale. When it comes to data mart vendors that tell you that a data mart can be magically turned into a corporate data warehouse, caveat emptor. Only in fairy tales can sows' ears be turned into silk purses.
