Continue in 2 seconds

I’m not sure if I’m going in the right direction with my data warehouse, data marts, cubes and business intelligence development.

By
  • Chuck Kelley, Clay Rehm
Published
  • October 09 2003, 1:00am EDT

Q:

I’m not sure if I’m going in the right direction with my data warehouse, data marts, cubes and business intelligence development. Our organization deals with healthcare planning so I have different databases dealing with census, mortality, inpatient/outpatient utilization and financial data. Each database has a main feed of data that is at the annual level for census, mortality and outpatient utilization, quarterly for financials and daily for inpatient utilization. I have created fact tables and dimensions for each database so the users can see all the components that are available for cross analyzing.

  • Should I be striving to have only one main database with all this data contained in one central location? The users have had studies that required uniquely built fact tables from all the databases. With the customization that is needed will I end up with one large mix of fact and dimension tables based on all the different studies?
  • Because census and mortality data is at an annual level and inpatient is at a daily with a rollup to annual, is it acceptable for analysts to export the data to Excel to show the relationships of the data? Or should I be building another fact table/cube to join the data - I don’t think Analysis Services supports multiple fact tables?
  • Does Analysis Services support the reuse of lookup tables among multiple dimensions if I placed all data in one database? I had problems creating a snowflake dimension. My State/County/Town table contains codes and descriptions. I received an error when trying to create a Patient_residence dimension and Facility_location dimension that utilized State/County/Town for its descriptions. I then coded the descriptions into the two dimension tables.
  • How do I know how much space these cubes/data are taking up on the Analysis Services, Database and the Proclarity side in order to determine if I am going to have capacity or performance problems as my data/users grow? Best regards.

A:

Chuck Kelley’s Answer: Wow, good questions, hope my answers are just half as thoughtful.

First bullet – I think the answer depends on your architecture and physical attributes. If you want them to be all in a single database, then do so. There are no major problems doing it this way. Regardless, I would create four (or more) Fact tables, one for each subject area (census, mortality, inpatient/outpatient utilization and financial data). Then, you can build whatever combinations and/or aggregates that you need that are needed to aid your user community.

Second bullet – Excel is fine, but I would most like create a combination fact table that could then be queried by the user community. You can use relational OLAP (ROLAP) or hybrid OLAP (HOLAP) – see documentation under Analysis Services and Flexible Data Mode for more information – for using multiple fact tables. If the data is on two or more different time aggregates, I would lean toward a new fact tables with the data rolled up.

Third bullet – You can by using ROLAP. I am not sure what you mean by snowflake in your example since it says State/County/Town table. That would lend itself more to star schema. Snowflake would have three tables – State, County and Town. So, assuming that you have a single table, create the fact table with two relationships between the fact and the State/County/Town table – one relationship for Patient_residence and one relationship for facility_location. Analysis services should run fine.

Fourth bullet – One day I will get my analysis services up and running again (I have not had time to work on it and it has been down for two to three months) and will respond to you directly via e-mail (unless one of the other experts answers it). At the moment, I cannot answer this bullet.

Clay Rehm’s Answer: Here are my responses to your questions:

  1. There must be common dimensions across the census, mortality, inpatient/outpatient utilization and financial data. I would encourage you to combine/merge this information so there is only one instance of these dimensions.
  2. There is no reason to have multiple databases even if the data is in different levels of granularity.
  3. I would build tables or cubes to make the work easier for your users.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access