Continue in 2 seconds

The University of Texas M. D. Anderson Cancer Center Tracks End User Database Usage to Provide Improved Data Warehouse Structure and Performance

By
  • Craig Owen, Liz Paul
Published
  • September 01 2003, 1:00am EDT

CATEGORY: Database Performance & Systems Management

REVIEWERS: Craig Owen, informatics manager, and Liz Paul, systems analyst III, Department of Medical Informatics, University of Texas M.D. Anderson Cancer Center.

BACKGROUND: Celebrating six decades of "Making Cancer History," The University of Texas M. D. Anderson Cancer Center is located in Houston, Texas, on the campus of the Texas Medical Center. It is one of the world's most respected centers devoted exclusively to cancer patient care, research, education and prevention. M. D. Anderson ranks among the nation's top cancer hospitals in U.S. News & World Report's annual "America's Best Hospitals." In September 2000, the M. D. Anderson Cancer Center decided to develop a fully integrated patient-centered data warehouse. By consolidating patient data, M. D. Anderson researchers, physicians and analysts are able to analyze up to 60 years of patient information by disease, diagnosis and procedures and based on a multitude of demographic information.

PLATFORMS: Our data warehouse uses Oracle8.1.7 on an IBM AIX RS6000 server and is approximately 250GB in size. Users access the warehouse with a number of products including Brio Explorer, Brio Insight, Brio Quickview and Microsoft Access. Informatica is the primary ETL tool.

PROBLEM SOLVED: Although we were able to determine who was using our data warehouse and when, we did not have detailed information on which tables, columns and/or joins were being used. With the implementation of Usage Tracker, we are able to retrieve more detailed information about what is being accessed in addition to learning more about how the data warehouse is utilized. As our customer base continues to grow, we plan to leverage additional features that will allow us to monitor long-running queries, look for reports that are performed repeatedly and restructure our star schema collections.

PRODUCT FUNCTIONALITY: Ambeo has the ability to capture all SQL submitted to our database. However, we have used the "exclude" feature of the SQL Listener to prevent the capturing of SQL from our ETL application, specific schema owners and our technical team in order for Usage Tracker reports to reflect only the usage by our customers. We use Usage Tracker to identify our top customers, which tables are used the most and what queries are potentially problematic so we can provide feedback to our trainers who educate our customers on best practices.

STRENGTHS: One of Usage Tracker's main strengths is the ability to capture and store the queries along with the customer information. This allows us to tune our data model for maximum efficiency. In addition, the information that Ambeo collects is very detailed and includes metrics, such as all of the SQL text (select, delete, create table, etc.), the tables and columns accessed, database customer ID, network customer ID, applications used to submit queries and who uses those applications, performance of the queries (including fetch time, analyze time and total time) and the number of queries for each day.

WEAKNESSES: The Usage Tracker reporting feature is limited in scope. Although there are many standard reports, flexibility is limited in the parameter and filter criteria options. We have written several queries that join numerous tables in order to produce the type of reports we need to view.

SELECTION CRITERIA: It was important to us that there would be no overhead when collecting the SQL query data. We also liked the fact that the usage data is stored in a relational database. While many standard reports come with the product, we can create more reports specific to our needs. The most important feature was the ability to know who and what our customers were doing so we could improve our data warehouse star schema collections.

DELIVERABLES: The Ambeo usage tables are the primary information source for us to view how our data warehouse is being used. The data that is stored in these tables can be easily accessed through the Usage Tracker Client interface. In addition to writing some of our own custom queries, there are 15 standard reports that can be used to audit the use of the warehouse.

VENDOR SUPPORT: Ambeo support has been very responsive and friendly in helping us understand the product. Our maintenance contract also includes updates on a quarterly basis.

DOCUMENTATION: The documentation is easy to use. We installed and used the product based on the documentation provided. No formal training was necessary, which saved us time and money. Our primary interest in the manual was the section concerning the various usage tables and how they relate to one another, as well as the report definitions and usage section.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access