JUN 19, 2008 4:54pm ET

Related Links

CIO Stepping Stones to Success
February 10, 2012
Birst Automates Connections to Big Data
February 8, 2012
Rising to the Enterprise App Demand?
February 8, 2012

Web Seminars

Suit Yourself: An Effective Recipe for Self-Service Analytics
March 20, 2012
How to Narrow the IT/Business Communication Gap
March 21, 2012
Enhance and Expand BI with Mobile
Available On Demand

Drill Down to Ask Why, Part 1

Print
Reprints
Email

Boiled down to its essence, the real purpose of a data warehouse is to be the perfect platform for decision-making. Most data warehouse (DW) and business intelligence (BI) architects accept this view, but how many stop and think carefully about what is decision-making, exactly? Every DW/BI architect can describe his or her technical architecture, but how many can describe the architecture of decision-making? If there even is an architecture of decision-making, how does the DW/BI system interact with its components, and what specific demands does decision-making place on the DW/BI system?

In 2002, Bill Schmarzo, a former member of the Kimball Group, proposed a very useful architecture for decision-making, which he called the analytic application process. According to Bill, an analytic application consists of five stages:

  1. Publish reports. Provide standard operational and managerial “report cards” on the current state of a business.
  2. Identify exceptions. Reveal the exceptional performance situations to focus attention
  3. Determine causal factors. Seek to understand the “why” or root causes behind the identified exceptions.
  4. Model alternatives. Provide a backdrop to evaluate different decision alternatives.
  5. Track actions. Evaluate the effectiveness of the recommended actions and feed the decisions back to both the operational systems and DW, against which stage one reporting will be conducted, thereby closing the loop.

I have found these analytic application stages to be very useful when I think about the architecture of a DW/BI system. Publishing reports (stage one) is the traditional legacy view of the data warehouse. We pump out reports and we stack them on the end users’ desks. There isn’t a lot of interactive BI in stage one! We also have been identifying exceptions (stage two) with thresholds, alerts and red/green blinking graphics for many years. At least in stage two, the choice of which alerts and thresholds we want on our desktops implies some judgment and involvement by the end user.

But it is in stage three, where we determine the causal factors behind the exceptions, that life really gets interesting. A good DW/BI system should let the decision-maker bring his or her full intellectual capital to bear on understanding what the system is bringing to our attention. This stage can be summarized by one all-important word: why.

Suppose that you work for an airline as a fare planner. In this role, a critical key performance indicator (KPI) is the yield, which according to Wikipedia is “the revenue or profits from a fixed, perishable resource such as airline seats or hotel room reservations. The challenge is to sell the right resources to the right customer at the right time for the right price.”

This morning, in your job as a fare planner, the DW/BI system produces a yield report (stage one) and highlights a number of airline routes for which the yield has dropped significantly (stage two). So, how does the DW/BI system support the all-important stage three? How does the DW/BI system support the fare planner when you ask, “Why are my yields down?”

Imagine five ways in which the fare planner might ask why. I’ll arrange these in order of increasing breadth and complexity:

  1. Give me more detail. Run the same yield report, but break down the high-level routes by dates, time of day, aircraft type, fare class and other attributes of the original yield calculation.
  2. Give me a comparison. Run the same yield report, but this time compare to a previous time period or to competitive yield data if it is available.
  3. Let me search for other factors. Jump to nonyield databases, such as a weather database, a holiday/special events database, a marketing promotions database or a competitive pricing database to see if any of these exogenous factors could have played a role.
  4. Tell me what explains the variance. Perform a data mining analysis, perhaps using decision trees, examining hundreds of marketplace conditions to see which of these conditions correlates most strongly with the drop in yield (explaining the variance in data mining terminology).
  5. Search the Web for information about the problem. Google or Yahoo! the Web for “airline yield 2008 versus 2007.”

Twenty years ago, when we drilled down in a data warehouse to ask why, we rarely provided more than the first capability above. I like to think of the longer list of all five capabilities as 2008’s definition of drilling down to ask why. Next month, I’ll explore how to implement these capabilities in a real DW/BI system.

Ralph Kimball is the founder of the Kimball Group and Kimball University where he has taught data warehouse design to more than 10,000 students. He is known for the best selling series of data warehouse "Toolkit" books. He started with a Ph.D. in man-machine systems from Stanford in 1973 and has spent the last 34 years designing systems for end users that are simple and fast. You can reach him at ralph@kimballgroup.com.

Advertisement

Comments (0)

Be the first to comment on this post using the section below.

Add Your Comments:
You must be registered to post a comment.
Not Registered?
You must be registered to post a comment. Click here to register.
Already registered? Log in here
Please note you must now log in with your email address and password.
Twitter
Facebook
LinkedIn
Login  |  My Account  |  White Papers  |  Web Seminars  |  Events |  Newsletters |  eBooks
FOLLOW US
Please note you must now log in with your email address and password.