What historical insights does your data warehouse provide to users?
This column is the concluding portion of a two-part column on delivery of historical information in the data warehouse environment. This installment examines a method for designing the data warehouse model that can be employed to satisfy this challenging business requirement. In the first part of this series, we saw that using standard design techniques such as slowly changing dimensions (SCD) type 2 only solves a portion of the business need. SCD 2 does easily provide a method to relate all fact table records to the latest dimension key view. There are alternate design approaches that can be implemented with varying levels of effort that provide the type of flexibility needed for this type of information delivery.
Using the human resources head-count report example from part one of this series, the first approach requires the establishment of an effective dated organization tree table. This technique requires that the information delivery tool be able to provide the user the choice of which temporal perspective of the information he or she needs. Operational meta data assignment can be used during extract, transform and load (ETL) processing to distinguish current versus historical records in the dimension table. New organizational nodes or changes to relevant dimension table columns, such as description columns, will have their current flag indicator set to "Y." This requires additional ETL processing of previously loaded records for the same organizational node to have the current flag column reset to "N." Use of an operational meta data column provides a more efficient method for determining the latest organizational node view than examining effective dates.
User head-count reporting selection can be through an effective dated prompt for the report (depending on the information delivery mechanism) or by designing two types of the reports, a current or historical version. The relationship between the head-count fact table and the organizational effective tree table is established through the lowest node level on the tree such as a department ID (see Figure 1).
In order to report on head count for a current view perspective, only organizational table records with a current flag set to "Y" are used (or highest effective date for node). This can be accomplished through constraint selection or a table view. For a historical perspective of head count, both the organizational node and the fiscal period are used to report. In Figure 1, department 1A1 head count will rollup into one or three business units depending on the temporal perspective chosen by the end user at query time.
The amount of history maintained in the effective dated organizational tree table is determined by business requirements. Snapshots of the tree can be taken on a daily, weekly or, more commonly, monthly basis. The size of the organization, the amount of history stored and frequency of capture may require that you separate current and historical views of the information into two tables. Establishing a current and history table will also improve performance if users more frequently run current view perspective reports. This technique requires additional storage space and the creation of supplementary ETL processes for maintenance of the second tables.
ETL processing for the two tables requires differing processing methods. The history tables uses the SCD 2 technique for maintaining history on production keys and makes extensive use of operational meta data columns to perform processing. New records inserted into the history table due to new organizational nodes will also be inserted into the current table. This is done in order to maintain balance in surrogate key assignments between the two tables.
This technique goes further than SCD 2 because all head-count table records can be related to the most current values of a particular organizational node. Additionally, this method requires some level of maintenance to deal with corporate realties. As firms are reorganized and organizational entities are phased out, the organizational node must to be mapped to the latest organizational tree in order for the head count to be accurately represented in current view reporting. Without the mapping to the latest tree, a department dropped as the result of an organizational realignment would not have its head count denoted, leading to erroneous and inaccurate reporting.
The business requirement to present reporting information from a current or historical perspective can be accomplished through a combination various data modeling and ETL processing techniques. The method outlined offers a means of satisfying this requirement but requires additional processing steps. This method should only be used where the business requirement is absolutely needed to switch back and forth between current and historical views of information. Finally, the information delivery tools must be able to effectively access the table structures created for this method in a manner that is intuitive for the business user accessing the data warehouse.
Register or login for access to this item and much more
All Information Management content is archived after seven days.
Community members receive:
- All recent and archived articles
- Conference offers and updates
- A full menu of enewsletter options
- Web seminars, white papers, ebooks
Already have an account? Log In
Don't have an account? Register for Free Unlimited Access