I recently spoke to a friend of mine who works as a business intelligence consultant about writing an article for Information Management with a new twist on the operational data store which I call DORDS. After a moment of silence, he said, A what? I said, A DORDS. Its a dimensional operational reporting data store. Then he asked, in his very low gravelly voice, You have my attention. So I proceeded to describe a recent project I had completed for a very small startup company with a unique business model.
Last year, I was contacted by a representative of this large company about my expertise on building data warehouses. For about 15 minutes, they described their business in general terms and their desire to create a data warehouse. I listened intently and took many mental notes. Taking a few minutes to gather my thoughts, I proceeded to describe several options. One option is to create an operational data store, which would alleviate the biggest problem of executing complex report queries against a transaction system. I proceeded to describe what they could do with this ODS and how they could subsequently create a real data warehouse as the main repository at a future date. We also discussed the transaction database issues, hardware configuration and their network.
So This is DORDS
DORDS, the dimensional operational reporting data store, or was born out of a typical modeling effort first to establish a quick win solution and satisfy the clients immediate reporting requirement. The decision was to incorporate a Kimball approach with a little bit of Claudia Imhoffs Corporate Information Factory strategy to design this business intelligence solution. In a flurry of activity over the course of about nine working days, different business process models were discussed, meetings were held, email sent and responded to and several conference calls scheduled. My client was understandably adamant about proceeding as quickly as possible with a delivery date of yesterday. As best I could, I explained that in order to birth a bouncing baby data warehouse and BI solution, there would need to be plenty of design sessions to conduct, define and map the enrollment process, perform data discovery and exploration of the data in their current transaction system, not to mention software evaluations and interviewing of prospective contractors before a deliverable date can be set.
The purpose of the ODS was to be a stepping stone, in accordance with the techniques of the CIF, followed by the development of an actual data warehouse. This final warehouse database was going to have a different design from the ODS. Instead of a dimensional database for reporting, the data warehouse would be constructed in the third normal form with some Boyce/Codd normalization in specific areas to satisfy the business requirement. It would be where the history was maintained.
The initial creation of the ODS was constructed to relieve the reporting issues. After go-live, the data would be pared down to accommodate perhaps between four and six months of data. The full complement of data would then be maintained in the data warehouses third normal form structures. Therefore, the initial ODS design was created during the first phase of the project, then a third normal form structured data warehouse would be created as the final database structure.
Several BI application tools were explored during the commercial off the shelf (COTS) search, however none offered the price-to-benefit ratio. Therefore, the client and I settled on one that was relatively inexpensive with a great maintenance plan, yet flexible enough to sit on top of a dimensional or third normal form database design. When combined with the dimensional ODS structures, the response time of data retrieval was subsecond. The corporate people, although they understood database structures, were ecstatic when they received their first report. The result set from the most complex report query returned hundreds of rows within a minute of execution against the database. The retrieval of data was so quick, the upper management staff initially thought the tool was broken. For most of the reports, the data appeared merely seconds after hitting the enter key.
Enrollments as a process turn into Facts after Dimensions
Because enrollments and memberships are much a part of the business during the interviews of the business and design workshops, we quickly discovered there were four main phases to the enrollment process. At each particular phase, business metrics were identified and defined. As it turned out, the first phase had two distinct parts, so in all actuality there were five parts to the four phases of enrollments.
Verifications serve as measurements in usage at each kiosk in predictive analytics. This verification process was used to properly identify and to verify each person using the kiosk as a member of the program. The business metrics surrounding verifications needed to count the number of members using the system. Through this type of analysis, the kiosk manager and upper management could reasonably predict the optimum time for a kiosk to open for business and how many kiosks and personnel would be needed at each location: in essence, labor management.
In the dimensional world, a DORDS is born. There are two main parts of the business: enrollments and membership.
As noted earlier, there are five parts to the enrollment inside four distinct phases.
Phase I: Online enrollments are conducted via a secure, Web-based application.
- Part 1: Consisted of collecting information about the potential member or enrollee.
- Part 2: Consisted of how this potential member or enrollee was to pay for the service.
The data that housed these two data points existed in two separate databases at two geographically different locations. This separation of data was perceived as more secure where a number and a date would be the actual join when necessary. If an Online enrollee met certain criteria, the enrollee would receive an email instructing him/her to proceed to the nearest In-Person Enrollment Kiosk with the appropriate paperwork. Otherwise, the enrollee would receive an email with a rejection notice. Also at this point, the credit card information, if not approved would be purged.
Phase II: In-Person Enrollments
This is where the enrollee would present his or her credentials, a biometrics collection would take place and security questions would be recorded. At the completion of the enrollment phase, the data would be collected and sent to the approving agency for vetting.
Phase III: Vetting
During this phase, the data would be reviewed by the approving agency. If and when it is approved, the credit card for the enrollee would get charged. At this point the enrollee actually becomes a member, and a record of approval would be sent back to the transaction database. During the extract, transformation and load processes, we would pick up the state change data and date and timestamp the enrollees status is updated.
Phase IV: Membership
A few days after the credit card charge, the appropriate membership card would be created and sent to the new member for immediate usage. The membership date starts on the approval date and when the credit card gets charged. However, there is a grace period where 10 to 15 days is allowed in the membership cycle to accommodate the lag in becoming a member and actually using the card for the privilege. Several business metrics are derived during this period where the number of days passed from the vetting approval date and the first time an enrollees card was mailed and when he or she used the card. This part of the business requirement was so complex we decided to create a separate dimension.
There were nine parts of the membership. Therefore, it required nine base business measurements. Due to confidentiality agreements, I cannot describe the different parts of the membership.
To satisfy the clients time constraints to create this ODS and have it operational in record time, we did not design or perform any transformations or change data capture routines. These transformations and change data capture routines were designed and implemented during the subsequent phases of the project. The ETL programmer designed the process so well that the initial data load took only 30 minutes. This gave us enough time to run several validation queries against the source and target systems that determined we had all the data prior to running the first of 10 operational reports.
After the initial operational capability (IOC) of the ODS, we were asked to create an intra-day data load ETL process. The modular design of the ETL process enabled us to segregate and create separate ETL processes to satisfy the intraday data loads from the online transaction database every three hours until the early evening. The main/full ETL processes were run after midnight to satisfy the stamp of when the data was loaded. This date stamp allowed for future enhancements to the ETL process after the initial go-live mainly to satisfy future requirements to store history and perform change data capture routines.
Although several BI tools were available, none offered a prescribed price to benefit ratio. The tool we chose was relatively inexpensive yet flexible. When combined with the dimensional ODS structures, the response time of data retrieval was subsecond. The corporate people, although they understood database structures, were ecstatic when they received their first report. The result set from the most complex report query didnt allow the staff to go get a cup of coffee and wait for the report to complete. The data appeared seconds after hitting the enter key.
Verification occurred any time a member processed through the kiosks. During this operation, a record of the start and stop times would be recorded at the kiosk then transmitted to the transactional database. The differences between these two timestamps were often negligible. However, the number and method of authenticating and verifying a member through the kiosk was the business measurement. Through this type of analysis, the kiosk manager and upper management can reasonably predict the optimum time for a kiosk to open as well as when and how many kiosks would be needed at each location.
Initial operational capability was achieved in three months and 19 days; follow-up work took another 10 months.
This was truly a dimensional operational reporting data store a hybrid model necessary to meet the requirements.
Register or login for access to this item and much more
All Information Management content is archived after seven days.
Community members receive:
- All recent and archived articles
- Conference offers and updates
- A full menu of enewsletter options
- Web seminars, white papers, ebooks
Already have an account? Log In
Don't have an account? Register for Free Unlimited Access