A significant amount of interest has been expressed in the press and over the Internet recently concerning the architectural construct of the operational data store (ODS). It seems that all of the publications agree that the ODS database concept is necessary to support the tactical decision-making processes of all enterprises. Yet, a few of the authors have implied that the ODS is part of a data warehouse (DW). We have found, from years of systems implementations, that it is impractical to combine the data structures of an ODS with that of a DW. The objective of this article is to present what an ODS is and why it is a separate database construct from the DW. The DW and the ODS serve as part of an overall architecture, the Corporate Information Factory1 (CIF). The data warehouse gives us a time-trended, historical perspective of the information required for strategic decision support. Using the data warehouse, we look for long-term patterns and trends, such as buying habits of consumers in certain regions of the country. To support that analysis, we may store data in the DW based upon hourly time increments, by day, by order, by SKU, by store. There is no question that a DW must contain very detailed data and maintain that data over a long period of time. It is the level of detail and time span of the data that make the DW database scheme impractical to use for tactical decision support.
Tactical decision support functions require more than a lower level of detail data. The frequency of update to the data is required at least daily, and many require near real-time to real-time updates. This high frequency of data updates is required for tactical business and technical management functions such as:
- Customer Contact
- Repair/Support Management
- Health Care Case Management
- Campaign Management
- Commercial Lending
- Mortgage Lending
- Inventory/Stock Management
- Fraud Detection
- Operational Data Integration
There have been three types of operational data stores defined in the book, Building the Operational Data store.2 These three types are primarily based upon the frequency of update and have been named Class I, Class II and Class III.
A Class I ODS has instantaneous (or near) updates from the application systems. Due to the need for this type of ODS to be very fast at updating, application exits or database triggers will be required to complete the extraction and transformation to the ODS environment. The complexity and synchronization required for this environment does not allow for the integration of data from many source systems. It is the frequency of update that limits our technical ability to integrate data. Generally, in a Class I ODS we can only consolidate a small number of data sources into one database. A Class I ODS is expensive to build due to the technology requirements for managing real-time updates. Many of the newer operational application packages, such as SAP and PeopleSoft, are being built with these ODS-type schemes.
A Class II ODS has near current updates from the source systems. These updates happen every one to two hours. The Class II ODS allows us a bit more flexibility in the amount of integration that can be completed during the update time frame. A Class II ODS can still be expensive based on the type of source systems that are being integrated.
A Class III, the more popular ODS, is updated once a day. This reduced update frequency provides us with a great opportunity to include multiple source systems for integration and the time to perform data cleansing and transformation. The reduced frequency allows us time to rebuild indexes, as well as correct errors or restart update jobs that may have failed. By far, the Class III ODS is the easiest and cheapest to build.
Why Build an ODS?
We are frequently asked very similar questions, "How do I know that I need to build an ODS? What problems exist that can be solved by building an ODS?" We have found that the reasons to build an ODS can be generalized into the following categories:
New operational applications that make use of data integrated from multiple business systems to conduct corporate-wide operational business functions. This generally requires the implementation of a Class I or Class II ODS. Examples of these integrated operational business functions are customer management or health care case management. This is a business- driven need for the ODS technology and is generally funded by a business organization. New applications generally need to be written to make use of the corporate-wide view of the data integrated into the ODS.
Back-office, tactical reporting that requires the integration of data each day for the nightly batch reporting. This generally requires the implementation of a Class III ODS. Organizations that are reengineering their daily batch reporting to make use of an ODS scheme fall into this category. Both the business units and technology organization fund implementations in this category. While additional functionality will be provided, these projects are generally viewed as cost-cutting projects.
The existing method of sharing data between operational systems through batch extracts has created a burdensome processing load in addition to a data quality nightmare. Many technology groups are facing this issue. A Class II or Class III ODS can be used to consolidate and integrate operational data limiting the data extract processes. Projects in this category are generally viewed as technology cost-cutting projects. They most often do not provide additional business functionality, and thus are funded fully by the technology organizations.
One of the biggest advantages a company can gain from an operational data store is the integration of operational data. An ODS serves three purposes during system integration:
1. Integration of operational data from many disparate systems to create a corporate view. This approach allows for integration of all pertinent corporate information as well as sets up an architecture that can be used to reengineer legacy systems very slowly over time, if that is the direction in which the company wants to proceed. This can be very expensive to build due to the complexity of the interfaces that must be built, redundant purchase of hardware and software, and the personnel required to take on such an endeavor.
2. Data shareability for new applications goes along the lines of number 1, except that data is not as tightly integrated. There may actually be multiple entities with customer information. All data is consolidated onto one platform to be shared with other applications. This environment can become difficult to maintain if a large number of source systems is included. It can also be created in a short amount of time. As this environment evolves, integration will become necessary.
3. Accessibility to data for reporting. This approach integrates data for reporting and sharing based on subject areas. Unlike number 2, it is integrated into subject areas, but we may still have multiple entities with customer information. It removes from the operational legacy systems the headaches of ad hoc reporting during business hours or convoluting precious batch windows. Many times companies will have too many extracts from source systems that become unwieldy to control. Many extracts are also costly to maintain. By integrating the data for sharing, we can save costly extracts.
Industries that we typically see making extensive use of an ODS are insurance, airlines, medical, telecommunications, cable TV and manufacturing industries. To aid the reader in understanding how to make use of the ODS technology for their organization, the following are examples of an ODS for an insurance, retail and a cable company.
Typical insurance subject areas are: claim, policy, provider, agent, payments and litigation This information is usually in many different systems and may actually be in regional or district offices. Therefore, there is no corporate view of the data. It could take weeks for a programmer to pull together a report worthy of management containing information about benefits paid on a specific claim on a specific policy.
By integrating this information in a Class III operational data store, we can save the programming staff a tremendous amount of time as well as set up for a snapshot of certain information into the data warehouse on a monthly basis. The data warehouse would use information such as amount of claims paid, number of claims to litigation, by policy type, by month, etc.
Placement of the data on the disks is going to be very important in an ODS that creates some roll ups. For instance, in our insurance scenario, we may keep an ongoing tally of the amount of claims that have been paid by policy number. Drill down may also be required in an ODS to the exact claim or payment information. Therefore, the model must reflect the drill-down requirement. Some customers "hoe the same rows" all the time and may not need this type of sophistication. If it is required, the architecture of this environment must be planned and implemented in a precautionary fashion. The access patterns of the customer must be gathered and planned upfront. An ODS with dynamically summarized tables will require more monitoring and tuning than an ODS with no summarized tables.
Indexing is not easy to determine without knowing the access patterns. The indexes must be built to give as much coverage without degrading the system during updates. It is recommended to segregate, on disk, summarized tables from detail data as much as possible.
Another example would be campaign and promotion management for a retail company. Data can be purchased from an outside provider (i.e., credit demographics). Promotion history can be used from the data warehouse and customer information integrated from the operational systems. This wealth of information integrated into an ODS provides the advantage and knowledge to manage campaigns.
Cable TV ODS
Another scenario is cable or satellite TV. Customer viewing and purchasing information may be collected during the day. At the end of the day, this information is tallied by customer and brought to the operational data store. This information is used to target specific customers for marketing campaigns. Information from the marketing staff can be input into the ODS in an application built specifically for them. Some of this information will eventually be brought into the data warehouse for analysis as well.
In summary, an operational data store is appropriate when day-to-day management of our business is required from an integrated or corporate view of our data. This will probably include data sharing and reporting. An ODS can feed pertinent data to the data warehouse on a time-sensitive basis. The ODS is a vital part of the Corporate Information Factory, but may not be for everyone!
1 Imhoff, Claudia. Inmon, W.H. Sousa, Ryan. The Corporate Information Factory, Wiley Computer Publishing, 1998.
2 Battas, Greg. Imhoff, Claudia. Inmon, W.H. Building the Operational Data Store, John Wiley & Sons, 1994.
Register or login for access to this item and much more
All Information Management content is archived after seven days.
Community members receive:
- All recent and archived articles
- Conference offers and updates
- A full menu of enewsletter options
- Web seminars, white papers, ebooks
Already have an account? Log In
Don't have an account? Register for Free Unlimited Access