The Challenge
The sales star schema is experiencing some performance and space issues, and your colleague would like advice on how best to restructure this model. What is your response?
The Response
Many database problems, such as performance and space, have as their root a gap in analysis. In this scenario, the gap is not fully understanding the scope or subject matter of the original requirements. Once the requirements are fully understood and properly scoped, a variety of different modeling techniques can enter the picture to reduce the super star into a series of fast and user-friendly stars.
Understanding the Requirements
A star needs to be defined by a set of business questions and not by a broad functional area such as sales. Ideally, a representative set of the business questions within the sales area can be identified and documented, and the metrics behind these questions can be grouped together into efficient stars. The metrics can be grouped by usage or by business area.
Grouping metrics by usage means assigning metrics to stars based on common reporting and queries. Nishant Upadhyay, data architect, would analyze usage by studying the reports generated from this super star. "This will enable me to categorize the reports into sales lifecycle reports, sales process reports, pipeline sales reports, etc. The super star can be reviewed to see if it can be broken up into these subject areas to make them more manageable and focused." Michael Workman, systems analyst, would interview business users and create a series of stars according to different audiences.
Grouping metrics by business area means breaking the sales function down into its business processes. Maurice Frank, data modeler, would begin by investigating the sales business processes. "Sales is very general. Break it down into smaller business processes and map the measures and dimensions to each process which would have its own fact table." Mike Nicewarner, data analyst, supports this approach by saying, "Divide and conquer! Instead of just sales, you have tables named outside sales or marketing efforts."
Applying Different Modeling Techniques
After understanding the requirements, design challengers suggested a number of modeling techniques to restructure this super star, including shifting simple calculations to the reporting tool, selectively applying indexes, re-evaluating history and restructuring dimensions.
Both Mustufa Kapadia, BI analyst, and Eve Halberg, BI team lead, stress the need to shift simple calculations to the reporting tool. Eve says, "I would review all metric elements to determine whether they are required. Base metrics are required; calculated metrics may be better handled by reporting tools, such as averages."
Indexes can be a quick fix to a performance issue when applied properly. Mustufa Kapadia would analyze indexes on both the facts and dimensions. He would also look for opportunities to reduce or offload the amount of history in the sales fact table. Another approach would be to store the oldest sales fact records at a higher level of granularity to save space and increase performance.
After understanding the requirements, the dimensions themselves can undergo a restructuring. Eve Halberg says, "Dimensions that can be merged are status or code dimensions. For performance, I usually look to create what I call 'key dimensions' for large dimension tables. For example, I broke my customer dimension into a customer dimension and customer key dimension. The key dimension contains the elements that are frequently accessed in reports. Also, sometimes I've seen metrics in a sense repeated, such as checking account deposit amount and savings account deposit amount. I would replace these with one amount and create a key dimension that types the amount."
If you would like to become a design challenger and have the opportunity to submit modeling solutions, go to www.stevehoberman.com/designchallenge.htm and add your email address. If you have a challenge for us, please email me a description of the scenario at me@stevehoberman.com.
Steve Hoberman is one of the world's most well-known data modeling gurus. He taught his first data modeling class in 1992 and has educated more than 10,000 people about data modeling and business intelligence techniques since then. Steve is known for his entertaining, interactive teaching and lecture style (watch out for flying candy!), and organizations around the globe have brought Steve in to teach his Data Modeling Master Class, which is recognized as the most comprehensive data modeling course in the industry. Steve is the author of "Data Modeling Made Simple," "Data Modelers Workbench" and "Data Modeling for the Business (Technics Publications). He is the founder of the Design Challenges group and inventor of the Data Model Scorecard.










Be the first to comment on this post using the section below.