The inadequacies of a "data mart only" approach to data warehousing and DSS are well documented. For a while now, the notion that all you need for decision support is data marts has been discussed. Indeed, many corporations thought they were doing data warehousing when they built their first data mart. The "data mart only" approach is popular because it is cheap, fast and easy. The data mart vendors push the approach because they know that if the corporation stops to build a real data warehouse, the vendors' quarterly revenue will be adversely impacted. But the "data mart only" approach has major pitfalls, most of which are not obvious until after the third or fourth data mart has been built.
What does a corporation discover after it has built the fourth or fifth data mart with no data warehouse as a foundation? Figure 1 shows what the corporate environment looks like when all you have is data marts.
Figure 1: The reasons why building only data marts is such a bad idea.
Figure 1 shows that the legacy environment directly supports the data mart environment. Each data mart extracts the same detailed data from the legacy environment. The interface to the legacy environment becomes a nightmare. There is massive redundancy of data among the different data marts, and each new data mart adds to the redundancy. The massive redundancy of detailed data makes the data marts much bigger and more expensive than they need to be. There is no reconcilability of data across the data marts. When asked about quarterly revenues, each data mart has its own interpretation, and the answers given by the different data marts are irreconcilable.
Adding data marts only exacerbates the problems. Soon the organization wakes up and realizes that a "data mart only" approach to decision support and data warehousing is fool's gold. A bunch of data marts is simply not the answer to effective decision support or data warehousing.
Data Mart Meta Data
One of the novel approaches to solving the problems of multiple data marts is attacking the problem of lack of integration with data mart meta data. The theory is that if a corporation can use meta data to glue the data marts together, the data mart meta data can solve the inadequacies of the data mart only approach.
Using data mart meta data to address the inadequacies of the "data mart only" approach to data warehousing has both great promise and great pitfalls. Anything in life that has great promise and great pitfalls at the same time must be taken seriously.
Indeed, meta data management among the data marts is the first step toward rectifying the inadequacies of the "data mart only" approach to decision support. But data mart meta data in and of itself does nothing for the corporation. To see why data mart meta data is not a remedy in and of itself, consider Figure 2, which shows distributed meta data among the different data marts. This distributed meta data is good for meta object exchange between data marts. One data mart can ask another what data resides in that mart or what is meant by one unit of data, or it can share its own meta data with another data mart. All of this can be a very good thing to do, provided it leads somewhere else.
Data mart meta data by itself does nothing to solve the problems of data marts. Look at Figure 2.
Figure 2: Gluing the data marts together with meta data.
Does meta data among the data marts do anything to solve the problems of the legacy-to-data-mart interface? No. Does data mart meta data do anything to address the massive amount of redundant detailed data found at each data mart? No. Does data mart meta data address the reconcilability of information across the data marts? The best answer here is a qualified maybe. Data mart meta data may codify why there are differences among data marts. In that regard, data mart meta data might assist an analyst in understanding what the differences are between the different marts. But reconciling the differences is a long way from starting to codify the differences. The best that can be said here is that MAYBE data mart meta data starts the process of communication among analysts using the different data marts. But expecting data mart meta data to actually reconcile the differences between data marts is asking for a miracle.
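The gap between codifying differences and reconciling them can be sketched in a few lines. The following is a minimal illustration only, assuming that each data mart records its definition of quarterly revenue as a small set of attributes. The mart names, attribute names and values are hypothetical and do not come from any particular product.

```python
# Hypothetical meta data: how each data mart defines "quarterly revenue".
# All mart names, attributes and values are invented for illustration.
REVENUE_DEFINITIONS = {
    "sales_mart": {"recognized_at": "order", "net_of_returns": False, "currency": "USD"},
    "finance_mart": {"recognized_at": "invoice", "net_of_returns": True, "currency": "USD"},
}


def definition_differences(definitions):
    """Return the attributes on which the marts disagree, with each mart's value."""
    attributes = set()
    for definition in definitions.values():
        attributes.update(definition)

    differences = {}
    for attribute in sorted(attributes):
        values = {mart: d.get(attribute) for mart, d in definitions.items()}
        # Keep only attributes where at least two marts hold different values.
        if len(set(values.values())) > 1:
            differences[attribute] = values
    return differences


differences = definition_differences(REVENUE_DEFINITIONS)
for attribute, values in differences.items():
    print(attribute, values)
```

The output tells an analyst where the two marts part ways. It does nothing to make the two revenue figures agree, which is the point of the paragraph above.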
Is It Useless?
So meta data for the data mart environment is not the right thing to do? No, that's not true either. Meta data for the data mart environment is the right thing to do if the distributed meta data leads to the proper architecture. In other words, meta data for data marts as an objective by itself is pretty pointless, but meta data for data marts as a basis for creating a proper architecture is a very good idea.
The architecture that a corporation needs to achieve is shown in Figure 3, in which a data warehouse acts as the foundation for the data marts. In this architecture there is meta data for the data marts.
Figure 3: Decision Support Architecture
How then can an organization use data mart meta data as a basis for arriving at the proper architecture? Consider the state of architecture shown in Figure 1. When a corporation decides to go to a real data warehouse, creating data mart meta data allows a few things.
It allows a corporation to see what overlap there is among the data marts. Comparing data mart meta data allows the analyst to quickly see where the opportunities are for designing and populating the real data warehouse. Such an analysis is called a commonality analysis.
It also allows the analyst to see how each data mart compares to the corporate data model. With data mart meta data, the analyst can easily see what data is or is not in the data mart and, in doing so, is able to assess how much work is required for the building of the data warehouse.
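Both uses of data mart meta data, the commonality analysis and the comparison against the corporate data model, come down to set comparisons. The following is a minimal sketch, assuming the meta data can be reduced to a set of data element names per mart. The mart names, element names and the contents of the corporate data model are hypothetical.

```python
# Hypothetical meta data: the detailed data elements held by each data mart.
MART_METADATA = {
    "sales_mart": {"customer_id", "order_id", "order_amount", "order_date"},
    "finance_mart": {"customer_id", "order_amount", "invoice_id", "order_date"},
    "marketing_mart": {"customer_id", "campaign_id", "order_date"},
}

# Hypothetical corporate data model, reduced to a set of element names.
CORPORATE_MODEL = {"customer_id", "order_id", "order_amount", "order_date", "product_id"}


def commonality(metadata):
    """Map each element held by two or more marts to the marts that hold it."""
    holders = {}
    for mart, elements in metadata.items():
        for element in elements:
            holders.setdefault(element, []).append(mart)
    return {e: sorted(marts) for e, marts in holders.items() if len(marts) > 1}


def compare_to_model(mart_elements, model):
    """Split a mart's elements into covered, missing and outside-the-model."""
    return {
        "covered": sorted(mart_elements & model),
        "missing": sorted(model - mart_elements),
        "outside_model": sorted(mart_elements - model),
    }


shared = commonality(MART_METADATA)
gaps = {mart: compare_to_model(elements, CORPORATE_MODEL)
        for mart, elements in MART_METADATA.items()}

for element in sorted(shared):
    print(element, "is held by", ", ".join(shared[element]))
for mart, gap in gaps.items():
    print(mart, "is missing", ", ".join(gap["missing"]))
```

Elements that appear in `shared` are candidates for being extracted once into the data warehouse rather than once per mart. The `missing` lists give a rough sense of how much of the corporate data model no mart covers yet.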
There is good reason for building data mart meta data. Data mart meta data as an end unto itself is not a worthy objective. Data mart meta data as a means to an end, where the end is the architecture that should have been built in the first place, is an extremely worthy objective.