- It represents a single, canonical state of the business (version of the truth). Too often, companies put data into a data warehouse and also store it in a plethora of other data stores. If a data warehouse must be match-merged with dependent data marts to provide needed information, then it is a potentially useful data store, but it is not active.
- It supports a mixed workload. The workload of an active data warehouse will typically consist of tactical inquires executing concurrently with complex business intelligence (BI) queries and trickle updates. If the DW is used only for operational queries such as customer transactions or product inventory, it is not active.
- Operational processing is driven by the DW. Active data warehouses do not exist in a vacuum. They exist in a processing loop. The "outbound" activity goes from the data warehouse to the operational system by means of automated system mechanisms including triggers, special purpose programming interfaces, a message broker and an extract, transform and load (ETL) tool - though the ETL tool is not often used for outbound activities. If the data warehouse doesn't deliver information automatically to operational systems, then it is not active. Manual intervention gets the job done, but the DW is not active.
- It represents a closed-loop process. In particular, the data warehouse is used to optimize processing in the upstream operational or transactional system. The operational systems feed the data warehouse which, in turn, feeds back to the operational system to optimize the relevant transactional processing. The interfaces go in both directions. The data warehouse provides operational intelligence and, as active, can properly be described as driving operational processing.
Active DW Joins Operational and BI in Both Directions
If a firm's data warehouse is trying to address these requirements, is still big and fat, but does not close the loop, then it is not active enough and is suffering from data warehousing obesity. It needs to become more active. That is not a trivial undertaking. A significant piece of design and implementation work will be needed to close the loop. Firms should approach active data warehousing with the following recommendations in mind.
Beware of marketing hype around the active DW. Many analytic customer relationship management (CRM) and business intelligence application vendors claim to be able to perform closed-loop processing. This is an incomplete claim. The closed-loop processing is performed, but only after custom design and implementation has created a unified view of the customers, products and sales, and designed and implemented the hooks back into the operational, enterprise resource planning (ERP) and legacy systems.
Rigorously define system interfaces. End users must carefully evaluate requirements in order to understand where a packaged analytic application ends and custom design and implementation begins. Obviously, any of the database vendors would argue that owning its database is a good start, and this may indeed be true. However, end-user clients will want to carefully analyze the interfaces between operational and decision support systems - in both directions - because that is the essence of active data warehousing.
Automate to activate. Make the data warehouse active by automation. When the decision support system generates operational initiatives, such as promoting an additional phone or line based on household usage (perhaps a teenager in the household might want his or her own phone), this goes on to active data warehousing. Another simple example would be an inventory system where the data warehouse becomes aware of an out-of-stock product and automatically generates a replenishment order to a supplier. Generally, this would require using such database mechanisms as triggers, stored procedures or an interface to a message broker, as well as customized application development. The loop has been closed back to the operational system from the decision support system. The data warehouse system is actively driving the operational one.
Lou Agosta is an independent industry analyst in data warehousing. A former industry analyst at Giga Information Group, Agosta has published extensively on industry trends in data warehousing, data mining and data quality. He can be reached at LAgosta@acm.org.