
Too Narrowly Focused

Data Integration Adviser

Information Management Magazine, August 1, 2008

Rick Sherman

In my recent columns, “The Accidental Architecture” and “Recovering from the Accidental Architecture,” I discussed whether people understood the “why” of best practices, such as dimensional modeling and hub-and-spoke architecture. In “Don’t Stop at How, Learn Why,” I discussed the most common problems when people are too hub myopic, cloning their source systems or assuming the hub will support all reporting and analytic needs.


There are also dangers with being too spoke-centric (with data marts or cubes) and failing to establish the hub, or data warehouse (DW), as the enterprise data backbone.

 

The Problem with Designing Only Spokes

The flip side of cloning the source systems is cloning the tables or files used to generate reports or perform analysis. This is the classic trap that people designing a business intelligence (BI) system fall into when they are spoke myopic.

When an architect designs a house, she starts by examining the site, the style of house, the overall size and what rooms the owner wants. The design gets more detailed as the owner’s family gives its input on the layout and each room. Eventually, they approve a plan and build the house.

For the most part, this is similar to IT working with the business to determine what data and reports it needs to build. IT works with the business to define the reports and then builds the DW and BI reporting.

But there is one big difference. Before an architect hands the blueprints to a contractor, she designs and draws the infrastructure. This includes the foundation, framing, heating/cooling, roof, plumbing, cabling and wiring. Although it seems obvious that the entire infrastructure must be designed and built into a house before anyone starts pouring concrete, that logic is often missing when it comes to building data warehouses.

Rather than building the hub-and-spoke design, some architects go right to the spokes, i.e., the data structures that support specific reports. That may work short term, but what happens when the business needs additional reports with expanded data fields, different business logic and performance metrics? IT is forced to do a costly and lengthy project to build the next spoke from the ground up - from the source systems to the spoke. You are forced to go to the source systems because you did not lay the infrastructure in the DW.

In addition, getting this new data and implementing the new transformations often forces changes in the initial reports. This ripple effect occurs because the existing reports were built from spokes built directly for the initial design - a design that had certain data, transformation and performance metrics built in that do not apply to every business situation.

This design trap most often occurs when BI developers drive the data architecture. BI developers are focused on specific reporting and analytical requirements rather than a broader design that creates a holistic, enterprise data backbone. It’s the classic problem of not seeing the forest for the trees.

The Hub and Spoke

The design traps people fall into are just like the story of “Goldilocks and the Three Bears”; they’re either too hot or too cold. With an overemphasis on either the hub or the spoke, the BI solutions are either too hard to use (hub centric) or too tough to change or expand (spoke centric).

The DW should be your data distribution hub for data marts or cubes for business reporting and analysis. As with any distribution hub, the DW exists to service its spokes. The warehouse needs to be flexible and scalable to support the changing and expanding needs of those spokes.

You need to design your data marts or cubes from the bottom up. This involves getting detailed business specifications on the data, transformation and metrics. These specifications are narrowly confined to the business users and processes that provide input. Once you have those specifications for your spokes, you need to incorporate them as input into your top-down DW design. You need to step back and take a more holistic enterprise view for your DW specifications. With the hub-and-spoke approach, you need to load enterprise data into the hub; business reporting or function-specific processing needs to take place from the hub into the spokes.
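The flow described above can be illustrated with a minimal Python sketch. It assumes a toy in-memory "warehouse"; the names SOURCE_ORDERS, load_hub and build_spoke and all of the sample values are hypothetical and exist only for illustration. The point it shows is that detailed data is loaded into the hub once, and each report-specific spoke is then derived from the hub rather than from the source systems.

```python
from collections import defaultdict

# Hypothetical source-system rows; every name and value here is illustrative.
SOURCE_ORDERS = [
    {"order_id": 1, "region": "East", "product": "widget", "amount": 100.0},
    {"order_id": 2, "region": "West", "product": "widget", "amount": 250.0},
    {"order_id": 3, "region": "East", "product": "gadget", "amount": 75.0},
]


def load_hub(source_rows):
    """Load detailed enterprise data into the hub once, at the lowest grain."""
    # Copy every field so later spokes can reuse data the first report ignored.
    return [dict(row) for row in source_rows]


def build_spoke(hub_rows, group_by, measure):
    """Derive a report-specific spoke (data mart) from the hub, not the sources."""
    totals = defaultdict(float)
    for row in hub_rows:
        totals[row[group_by]] += row[measure]
    return dict(totals)


hub = load_hub(SOURCE_ORDERS)

# The first spoke supports the initial report.
sales_by_region = build_spoke(hub, "region", "amount")

# A later request becomes a new spoke over the same hub; the sources are not touched again.
sales_by_product = build_spoke(hub, "product", "amount")
```

In this sketch the second spoke needs no new extraction work because the hub already holds the product field, even though the first report never used it. A spoke-only design would have kept just the regional totals and forced a return to the source systems.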

Resist the Urge

There are two major reasons for the spoke-centric approach. First, the design is often driven by BI developers who live to develop reports. These folks are often very tactical and delivery focused. It is common for the BI developers to concentrate on the immediate deliverables and lose sight of the value of the hub.

Second, the urge to get things done is often associated with coding, not designing. When the project starts, business people and IT management are waiting for reports to be delivered, so spending time on the DW design slows down report development. The BI developers can create reports so quickly that people believe if the report works, then all the data is fine. Unfortunately, many have learned the hard way what is behind the scenes: the DW makes a significant contribution to the long-term success of your BI solutions.

Rick Sherman has more than 20 years of business intelligence and data warehousing experience, having worked on more than 50 implementations as a director/practice leader at PricewaterhouseCoopers and while managing his own firm. He is the founder of Athena IT Solutions, a Boston-based consulting firm that provides data warehouse and business intelligence consulting, training and vendor services. Sherman is a published author of over 50 articles, an industry speaker, a DM Review World Class Solution Awards judge, a data management expert at searchdatamanagement.com and has been quoted in CFO and Business Week. Sherman can be found blogging on performance management, data warehouse and business intelligence topics at The Data Doghouse. You can reach him at rsherman@athena-solutions.com or (617) 835-0546.

In addition to teaching at industry conferences, Sherman offers on-site data warehouse/business intelligence training, which can be customized, and teaches public courses in the Boston area. He also teaches data warehousing at Northeastern University's graduate school of engineering.

