The architecture of data warehousing has come a long way in a short amount of time. In the late 1980s ­ the early dawn of data warehousing ­ a data warehouse was merely a seperate structure for the execution of DSS processing. Today several different and important forms of warehousing are recognized. As the ad for Virginia Slims said ­ "Baby, we've come a long way." (If there is anything that will mark age and time, it is that saying from the sixties. Some of you recent college graduates will have no idea what that catch phrase is all about. My apologies.)

In today's world of data warehousing, some recognizable architectural constructs have arisen. There is the enterprise data warehouse, which is at the heart of the DSS environment. There is the operational data store, or ODS, which is a hybrid form of operational and DSS processing. There is the data mart, designed for departmental housing and processing of data. There is the exploration warehouse, which enables the out- of-the-box thinkers to have their own resources. And perhaps the newest kid on the block is the data mining warehouse, which is similar in concept to the exploration warehouse ­ with a few important exceptions. The data mining warehouse allows for the testing of hypotheses while the exploration warehouse is an infrastructure dedicated to the discovery of the hypotheses in the first place. Finally there is near line storage, which is used to house infrequently used and bulk data in the data warehouse.

There are other important aspects of the DSS environment, such as distributed meta data and multiple enterprise data warehouses. But the bare-bones architectural structures that make up the DSS environment are those that have been outlined here and shown in Figure 1. While there is a certain sameness to all of the structures, there are some important distinctions and peculiarities that make each architectural structure unique.

Figure 1: The Corporate Information Factory

For example, from a structure standpoint, the ODS and the enterprise data warehouse have many similarities. But the ODS contains only current data while the enterprise data warehouse contains a robust amount of historical data.

The data marts, at first glance, appear to be a subset of the enterprise data warehouse. But appearances are deceiving. The enterprise data warehouse represents corporate data while the data marts represent departmental data. The enterprise data warehouse contains historical data. The data mart does not. The enterprise data warehouse contains detailed data. The data mart contains a great deal of summary data. The enterprise data warehouse supports EVERY department. The data mart supports only one, and so forth. There are very significant differences between the data mart and the enterprise data warehouse.

The near line storage component contains data that is normalized and aims for flexibility of retrieval since many different functions will be served from this atomic level of data.

The exploration warehouse contains a subset of the enterprise data warehouse, but a subset taken only as of a moment in time. The enterprise data warehouse contains data that is constantly being loaded into the database ­ it is not a static set of data over time.

The data mining data warehouse contains flat file data that is optimal for analysis. The data mining data warehouse is deep, aimed for in- depth analysis of a few variables. The exploration warehouse is wide, aimed for intuition and observation.

There are then significant differences between the different architectural constructs that make up the DSS architecture. However, the architectural differences are nowhere more apparent than in the underlying structure of the data.

In order to understand just how different the underlying structures of data are, consider the normalized structure and the star join structure. The normalized structure mandates that data be represented in small tables which are related by a foreign key. There are many tables representing different aspects of the business in a normalized model. The star join is quite different because data is denormalized in accordance with the usage of the data and the number of occurrences of the data. In order to create a star join, a heavy emphasis is placed on gathering end-user requirements before the star join is created. There is no denying that a normalized structure is very different from a star join.

How exactly do the different types of data structures map to the DSS architecture? Figure 2 shows that there is a very clear mapping of normalized structures to the enterprise data warehouse, and there is a clear mapping of data marts to star joins. But what about the ODS? The ODS is peculiar in that it is a hybrid structure containing both elements of a normalized structure AND elements of a star join. For those parts of the ODS that are used regularly, a star join structure is optimal. For those parts of the ODS that are not used as part of regular processing, the ODS contains normalized data. Ironically, mixing both of the structural types isn't a problem.

However, most interesting are the exploration warehouse and the data mining warehouse. The exploration warehouse contains a series of normalized structures, each of which have been tokenized and fully indexed. The resulting structure is ideal for exploration. The data mining data warehouse contains perfectly flat data, the opposite of what is found in the exploration warehouse.

There is so much variety in the data warehouse environment when it comes to data structures. The data warehouse DSS environment is not a static, one-structure environment. Instead the data warehouse caters to different audiences and different users by offering a diversity of structures.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access