In an effort to organize and coordinate information vital to the decision-making process, many government IT managers are implementing data warehouses. Often, they are struggling with islands of information around their organization and want to move to a centralized data warehouse that speeds access to critical information. Data warehousing solutions offer a way to unify disparate systems with one consistent version of data, transforming raw data into useful business information for strategic decision-making. When the Department of Defense, Health Affairs decided to transition from a fee-for-service to a managed care form of health care delivery for its 8.5 million beneficiaries, its IS team determined that it needed a corporate executive information system (CEIS) and an enterprise data warehouse. A crucial part of their challenge would be finding the best way to move data from multiple existing systems, perform the complex transformations required to integrate the data and get it all done under a tight operational window.

DoD Health Affairs is a global network of 225 Army, Navy and Air Force hospitals, 400 clinics and numerous outpatient facilities. Five major source data collection systems (SDCS) located in twelve geographic regions have been identified as data sources.

The DoD turned to EDS to help them create an enterprise data warehouse that would collect data from the local systems, perform business rules-based transformations and load a data warehouse which could be queried for analysis and reporting purposes. With 800GB of historical raw data and thereafter 160GB of data per month to be moved from the local systems into the data warehouse, the team needed an industrial-strength extraction/transformation tool to perform all of the movement and transformation of data.

Extraction and transformation tools (ETTs) perform one of the most difficult steps in the data warehouse creation process--the extraction, transformation and aggregation of data from multiple sources. This step is particularly challenging to organizations that can no longer fall back on armies of COBOL programmers to write code for such routines.

ETTs can move data from disparate, even heterogeneous sources, to a data warehouse and perform complex transformations and aggregations. A good ETT takes the burden off the IT department, automating processes which otherwise make more work for in-house programmers, performing change data capture for quick updates and allowing for custom routines that are reusable and portable.

As the team evaluated ETTs from many vendors, they focused on four key criteria: visual processing--they wanted a tool that would let them layout and arrange their capture and movement of meta data; ability to configure and manage GUI generated routines; performance of the tool--they needed something that would process big loads fast and allow them to design customized job data flows; and price performance.

After four months of evaluating and testing, EDS selected Ardent (formerly VMARK) DataStage for purchase by the DoD. Running DataStage, the transformation servers now take all source data collection systems, including separate ambulatory, inpatient, ancillary, expense and purchased care information sources, integrate them and send the data to the enterprise data warehouse.

The DoD data warehousing solution exploits many aspects of the ETT, including its inherent database, custom routine builders and ability to perform complex transformations. The IS team has even expanded this transformation process and is now planning to use the tool at the regional level to pull data from the hospitals and populate the local systems. The transformation servers will eventually send data to the enterprise data warehouse as well as some regionally created data marts.

After working with EDS and other vendors for a year and a half, DoD, Health Affairs now has a prototype data warehouse that integrates data from all the SDCS. The prototype consists of Informix Online Dynamic Server and Informix MetaCube running Digital UNIX on a Digital Alpha 4100. The transformation servers are four processor Compaq Proliant 6000s running Windows NT. The extraction and transformation tool is Ardent's DataStage. BusinessObjects, SPSS and an EDS-developed tool, MetaBase, are being used for reporting and analysis.

The Corporate Executive Information System (CEIS) enterprise data warehouse is in the final stages of development and should be complete in June of 1998. Upon completion, the operational system will manage more than 2.2 terabytes of data.

"Using the DataStage extraction/transformation tool is not only a cost saving measure, but significantly reduces time to deployment," said Jeffrey Lopata, director of deployment and operations for CEIS. "Its ability to move the data from disparate sources, ease of design and the availability of on-site consulting support have all been key in helping the IS team meet its objectives."

EDS has been under a tight deadline to provide a proof of concept to the CEIS program management office and had to develop a fairly sizable subset of the total solution. Selecting the right ETT tool has been key to meeting this and other deadlines.

As the Department of Defense, Health Affairs and other health care providers transition to managed care, they, like other IT users, are undertaking the challenge of putting together enterprise architectures for decision support. ETTs play an important role in this process, executing what is often the hardest part of building a data warehouse--extracting the data from disparate and often heterogeneous systems, performing complex transformations and loading it into an enterprise data warehouse. The right tools can make the difference between a very arduous process and a manageable one. For this reason, extraction and transformation tools that are easy to use, able to handle complex transformations and support rapid deployment address a critical need.

Practical Advice

1) Never assume that your current applications are collecting all of the data that you will need in your data warehouse/data mart.

2) Always keep an open mind to the various types of data that are available from third-party sources that can be added. (Example: weather, stock prices, competitor information, etc.)

3) Take advantage of your database vendors when designing and developing your warehouse/mart. Many times they will be able to help you avoid problems that their other customers have run into.

4) Do not underestimate the time required to develop the data extraction/population piece of the project. Most data warehousing/mart projects that fail are due to this. Using an extraction/transformation tool will shorten development time and make it possible to accurately estimate your time requirements.

5) Remember that building a successful data warehouse/mart requires input from people in many different departments.

6) The three most important things to look for when selecting tools for warehouse/mart development, maintenance and reporting are: a solid yet flexible product that will meet changing needs, a stable vendor with strong support capabilities and a solid track record for the vendor and the tool.

7) Set clear expectations to management as to time and funding requirements. A project that is completed before the expected time, under budget and demonstrates a ROI will always be a success.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access