MAY 1, 2007 1:00am ET

Related Links

10 Sustainability Predictions for 2011
February 23, 2011
A Letter to Future Employees: Embrace Analytics
February 3, 2011
A Hunger for Risk
January 6, 2011

Web Seminars

How to Narrow the IT/Business Communication Gap
March 21, 2012
Data Modeling Made Simple with Steve Hoberman
Available On Demand
Go Big Data or Go Home
Available On Demand

Super Star

Print
Reprints
Email

A star schema is designed to efficiently answer metric business questions, such as those questions that return amounts, quantities and counts. For example, "What are our total sales by customer and product over the last six months?" Stars are generally small in scope, with a handful of dimensions surrounding a single fact table. But one day, a colleague stops by your desk with a printout of a star schema that makes you jump right out of your seat. It is a sales super star! More than 1,000 data elements from the sales area are on this model. You learn that this star was produced by placing every metric data element from sales in the fact table and every nonmetric data element in a dimension. The fact table contains more than 100 different metrics and there are more than 30 dimensions! You are even more shocked to learn that this star schema is in production!

The Challenge

The sales star schema is experiencing some performance and space issues, and your colleague would like advice on how best to restructure this model. What is your response?

The Response

Many database problems, such as performance and space, have as their root a gap in analysis. In this scenario, the gap is not fully understanding the scope or subject matter of the original requirements. Once the requirements are fully understood and properly scoped, a variety of different modeling techniques can enter the picture to reduce the super star into a series of fast and user-friendly stars.

Understanding the Requirements

A star needs to be defined by a set of business questions and not by a broad functional area such as sales. Ideally, a representative set of the business questions within the sales area can be identified and documented, and the metrics behind these questions can be grouped together into efficient stars. The metrics can be grouped by usage or by business area.

Grouping metrics by usage means assigning metrics to stars based on common reporting and queries. Nishant Upadhyay, data architect, would analyze usage by studying the reports generated from this super star. "This will enable me to categorize the reports into sales lifecycle reports, sales process reports, pipeline sales reports, etc. The super star can be reviewed to see if it can be broken up into these subject areas to make them more manageable and focused." Michael Workman, systems analyst, would interview business users and create a series of stars according to different audiences.

Grouping metrics by business area means breaking the sales function down into its business processes. Maurice Frank, data modeler, would begin by investigating the sales business processes. "Sales is very general. Break it down into smaller business processes and map the measures and dimensions to each process which would have its own fact table." Mike Nicewarner, data analyst, supports this approach by saying, "Divide and conquer! Instead of just sales, you have tables named outside sales or marketing efforts."

Applying Different Modeling Techniques

After understanding the requirements, design challengers suggested a number of modeling techniques to restructure this super star, including shifting simple calculations to the reporting tool, selectively applying indexes, re-evaluating history and restructuring dimensions.

Both Mustufa Kapadia, BI analyst, and Eve Halberg, BI team lead, stress the need to shift simple calculations to the reporting tool. Eve says, "I would review all metric elements to determine whether they are required. Base metrics are required; calculated metrics may be better handled by reporting tools, such as averages."

Indexes can be a quick fix to a performance issue when applied properly. Mustufa Kapadia would analyze indexes on both the facts and dimensions. He would also look for opportunities to reduce or offload the amount of history in the sales fact table. Another approach would be to store the oldest sales fact records at a higher level of granularity to save space and increase performance.

After understanding the requirements, the dimensions themselves can undergo a restructuring. Eve Halberg says, "Dimensions that can be merged are status or code dimensions. For performance, I usually look to create what I call 'key dimensions' for large dimension tables. For example, I broke my customer dimension into a customer dimension and customer key dimension. The key dimension contains the elements that are frequently accessed in reports. Also, sometimes I've seen metrics in a sense repeated, such as checking account deposit amount and savings account deposit amount. I would replace these with one amount and create a key dimension that types the amount."

If you would like to become a design challenger and have the opportunity to submit modeling solutions, go to www.stevehoberman.com/designchallenge.htm and add your email address. If you have a challenge for us, please email me a description of the scenario at me@stevehoberman.com.

Steve Hoberman is one of the world's most well-known data modeling gurus. He taught his first data modeling class in 1992 and has educated more than 10,000 people about data modeling and business intelligence techniques since then. Steve is known for his entertaining, interactive teaching and lecture style (watch out for flying candy!), and organizations around the globe have brought Steve in to teach his Data Modeling Master Class, which is recognized as the most comprehensive data modeling course in the industry. Steve is the author of "Data Modeling Made Simple," "Data Modeler’s Workbench" and "Data Modeling for the Business (Technics Publications). He is the founder of the Design Challenges group and inventor of the Data Model Scorecard.

Filed under:

Advertisement

Comments (0)

Be the first to comment on this post using the section below.

Add Your Comments:
You must be registered to post a comment.
Not Registered?
You must be registered to post a comment. Click here to register.
Already registered? Log in here
Please note you must now log in with your email address and password.
Twitter
Facebook
LinkedIn
Login  |  My Account  |  White Papers  |  Web Seminars  |  Events |  Newsletters |  eBooks
FOLLOW US
Please note you must now log in with your email address and password.