William would like to thank Mike Cross (firstname.lastname@example.org), data warehouse director at Rent-A-Center, for his contributions to this month's column.
Events and states are similar and related but need to be modeled and treated distinctly within your data warehouse. An event is simply a point-in-time occurrence and has only one time associated with it. This does not imply that an event can happen only once, rather, an event is a single occurrence. A state, on the other hand, represents something over a period of time and has a specific start and end time. States may have a null end time, but only one per subject. Examples of inventory events are delivery, return, to-service and from-service. Inventory states are on-rent, idle, on-loan and in-service. Events often trigger changes in states, e.g., a delivery begins the on-rent state, and if enough events are defined, maintaining the state of an inventory item is not necessary. Maintaining the state of events makes querying and reporting much easier and does not require the understanding of complicated business rules that associate events with states by the user. The business rules behind state changes can be implemented within the extract, transform and load (ETL) and with triggers, removing this burden from the report developers and user community.
When architecting your data warehouse, it is best to define and accommodate events and states as separate entities and not mix or imply one with the other. In addition, when gathering business intelligence requirements, it is vital that the user community differentiates states and events.
Every business organization has complicated business rules that drive reporting. These same criteria are often used in multiple places across the organization and are included in a myriad of reports. Consider the following simple business example. When reporting year-over-year "same stores," only include stores that: have been open for more than 18 months; have not been remodeled, enlarged or relocated and were not part of any test marketing. And, do not include data from any day that was a holiday or the store was open less than six hours. If this same criteria, which is often subject to change, is used by 50 different reports, ranging from revenue reporting to employee turnover, the maintenance burden becomes onerous and the potential for inconsistent results very likely.
Consider using data-driven criteria, which can be used to implement complex business rules that are repeated throughout your enterprise. Instead of replicating and maintaining business rules everywhere they are needed, it is easier to create a simple reference filter table used by all reporting that is maintained by a single ETL process. The filter table should consist of five fields (store identifier, filter identifier, start date, end date and an include/exclude flag) and will contain data filters as well as report filters. A data filter says whether or not to include data from this period for this store, and a report filter says whether or not to include this store on reports for this period. Rent-A-Center (RAC) uses a stored procedure at the end of every data warehouse refresh to refresh the filters. This stored procedure contains all of the business rules in a single, easily maintained place, and the logic is only coded once. When a report developer needs to develop a report for "same stores," I simply instruct him to use filter X for data and filter Y for the report. He does not need to know the business rules behind them unless he wants to. As the business rules change, the filter logic is updated and the reports automatically get updated without any code changes. Using a filter table is a very simple solution to a complicated business problem.
In addition to using a filter table at RAC, we often encounter business requirements that must filter data based upon dimension values. For these requirements, we add flag columns to the dimension tables to indicate inclusion/exclusion for certain categories. Consider, for example, which revenue categories should be included in the nebulous summation "total revenue." RAC's revenue type dimension table simply has an attribute entitled "total revenue" that contains a yes/no flag. Report developers simply need to reference this attribute to determine which revenue values should be reported in total revenue and do not need to maintain a lengthy list of surrogate keys or business rules everywhere the value is needed. Again, an ETL process maintains the attribute, contains all the business rules and reports problems when new revenue categories appear that it does not know how to address.
Event-state management and data-driven criteria are common business problems that can easily be addressed within your data warehouse instead of in reporting logic to provide consistent, reliable results to the business community.
Register or login for access to this item and much more
All Information Management content is archived after seven days.
Community members receive:
- All recent and archived articles
- Conference offers and updates
- A full menu of enewsletter options
- Web seminars, white papers, ebooks
Already have an account? Log In
Don't have an account? Register for Free Unlimited Access