One of the most promising and exciting components of the data warehouse/corporate information factory environment is the exploration warehouse. The exploration warehouse takes in massive amounts of detailed data from the data warehouse and combines it with data from other sources (e.g., external sources). Once collected, the data in the exploration warehouse is integrated into a body of information that provides a foundation for deep statistical analysis. The data in the exploration warehouse is very granular and historical.

Business Patterns

The analyst perusing the exploration warehouse looks for previously undiscovered business patterns. Typical patterns that could be identified through exploration include:

  • A correlation between sales and weather (e.g., increased beer consumption and high temperatures).
  • The strength of peak-season retail sales and the items that are early indicators of that strength.
  • The factors of manufacturing production that correlate to a fall in yield.
  • A comparison of how corporate sales units over time correlate to industry sales of similar units over that same time period.

Once discovered, these business patterns can yield a competitive advantage through collaboration between the explorer and the business analyst.
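As an illustration of the first pattern listed above (sales and weather), the following is a minimal sketch of how an explorer might measure the strength of such a relationship with a Pearson correlation coefficient. The function name `pearson` and the temperature and sales figures are hypothetical and exist only for this example; real exploration would run over far larger volumes of granular, historical data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical daily observations: high temperature (deg C) and beer units sold.
temperature = [18, 21, 24, 27, 30, 33]
beer_units = [110, 125, 150, 160, 190, 205]

r = pearson(temperature, beer_units)
print(f"temperature vs. beer sales: r = {r:.3f}")
```

A value of r close to 1 suggests that the two series rise together. Correlation alone does not establish cause, which is one reason the explorer and the business analyst need to interpret such findings together.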

Data Marts

Also residing in the corporate information factory is a similar, yet distinct, architectural component: the data mart. Data marts are constructed on a departmental basis. Typical departments requiring data marts include finance, sales, marketing and accounting. Data marts traditionally run on online analytical processing (OLAP) multidimensional technology using a star schema or snowflake design. The structure of the data mart is designed to accommodate the needs and requirements of the individual departments. Data marts contain summarized and highly denormalized data.

KPIs

As a rule, data marts track key performance indicators (KPIs) over time. The organization that owns the data mart determines what KPIs it requires, and those KPIs are calculated and tracked on a regular basis in the data mart. Typical KPIs tracked in a data mart are cash on hand, customer pipeline, sales movements, orders satisfied versus new orders, sales productivity and shelf life for selected SKUs.

The calculations needed to support the KPIs are often made on a monthly basis, although they can be made more (or less) frequently. In general, a great deal of processing is required to shape the detailed, integrated data found in the data warehouse into the form required by the multidimensional design of the data mart. Data must be merged, summarized, etc., before it is ready for multidimensional technology.
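The merge-and-summarize step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not a description of any particular product: the order rows, the `customer_region` lookup and the function `monthly_sales_by_region` are all hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical detail rows from the data warehouse:
# (order_date, customer_id, sales_amount)
orders = [
    (date(2024, 1, 5), "C1", 1200.0),
    (date(2024, 1, 19), "C2", 300.0),
    (date(2024, 1, 22), "C3", 800.0),
    (date(2024, 2, 3), "C1", 500.0),
]

# Hypothetical reference data to be merged with the order detail.
customer_region = {"C1": "East", "C2": "East", "C3": "West"}


def monthly_sales_by_region(order_rows, region_lookup):
    """Merge orders with region data, then total sales per (month, region)."""
    totals = defaultdict(float)
    for order_date, customer_id, amount in order_rows:
        # Merge step: attach the region dimension to each detail row.
        region = region_lookup.get(customer_id, "Unknown")
        month = (order_date.year, order_date.month)
        # Summarize step: roll the detail up to the monthly KPI grain.
        totals[(month, region)] += amount
    return dict(totals)


kpi = monthly_sales_by_region(orders, customer_region)
for (month, region), total in sorted(kpi.items()):
    print(month, region, total)
```

The result holds one total per month and region. The individual orders are no longer visible at that grain, which reflects the summarized nature of data mart data.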

Once the data has been put into a multidimensional format, the data can be "drilled." A manager who has access to multidimensional data can drill to "explore" that data. If a KPI for the month reveals an unexpected value, the manager can examine the reasons for the calculation.
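Drilling can be pictured as re-aggregating the same summarized facts at a finer grain, one dimension at a time. The sketch below assumes a hypothetical fact list with three dimensions (month, region, product) and a hypothetical helper named `drill`; actual OLAP tools offer this through their own interfaces.

```python
from collections import defaultdict

# Hypothetical summarized facts: (month, region, product, sales_amount)
facts = [
    ("2024-03", "East", "widgets", 400.0),
    ("2024-03", "East", "gadgets", 150.0),
    ("2024-03", "West", "widgets", 900.0),
    ("2024-03", "West", "gadgets", 50.0),
    ("2024-02", "East", "widgets", 420.0),
]

DIMENSIONS = ("month", "region", "product")


def drill(rows, group_by, **filters):
    """Total sales grouped by one dimension, limited to rows matching filters."""
    position = {name: i for i, name in enumerate(DIMENSIONS)}
    totals = defaultdict(float)
    for row in rows:
        if all(row[position[dim]] == value for dim, value in filters.items()):
            totals[row[position[group_by]]] += row[-1]
    return dict(totals)


# Top level: the monthly KPI.
by_month = drill(facts, "month")
# Drill into March by region, then into the West region by product.
march_by_region = drill(facts, "region", month="2024-03")
west_by_product = drill(facts, "product", month="2024-03", region="West")

print(by_month)
print(march_by_region)
print(west_by_product)
```

Each step can only follow dimensions that were built into the data mart. A question that falls outside those dimensions cannot be answered by drilling.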

If this type of analysis can be performed in a data mart, why do we need an exploration warehouse?

Drilling and Exploration

Indeed, the drill down performed in the data mart environment is a form of exploration, but there are fundamental differences between that process and the process of the professional statistician in the exploration warehouse. Some of those differences are:

  • When drilling into the data mart data, the manager can operate only on the highly structured KPI data that is found in the data mart. The professional statistician can operate on any data that can be put in the exploration warehouse, unfettered by structure.
  • When drilling into the data mart data, the manager is restricted to queries relating to and supported by the KPI data found in the multidimensional environment. The professional statistician is unrestricted when it comes to the formulation of queries.
  • When drilling into the data mart data, the manager usually looks at fairly short periods of time: the current period and perhaps a corresponding value for the previous year. The professional statistician is interested in lengthy periods of time.
  • The drill-down analysis that occurs in the data mart is of an ongoing variety. The same KPIs are captured and analyzed over time. The analysis that occurs in the exploration warehouse is done on a project basis. One aspect of the business is studied in depth and from many different perspectives; and when the study is completed, the exploration warehouse is folded.
  • The queries submitted by the manager who is drilling down are relatively short and well-defined. In contrast, the queries submitted by the professional statistician are anything but short and well-defined.

In fact, it is not unusual for a manager to start the analytical process by looking at multidimensional data found in the data mart, analyzing that data through drill-down processing as far as it can be taken, and then asking the professional statistician to continue the analysis in greater depth.

There are similarities between the analytical/exploration activities found in the data mart environment and the analytical activities that occur in the exploration warehouse environment, but there are many more differences. While exploration does occur in the data mart environment, the exploration that occurs is of a very cursory nature. Deeper and more sophisticated exploration occurs in the exploration warehouse environment.