In my recent columns, "The Accidental Architecture" and "Recovering from the Accidental Architecture," I discussed whether people understood the why of best practices, such as dimensional modeling and hub-and-spoke architecture. In "Don't Stop at How, Learn Why," I discussed the most common problems when people are too hub myopic: cloning their source systems or assuming the hub will support all reporting and analytic needs.
There are also dangers with being too spoke-centric (with data marts or cubes) and failing to establish the hub, or data warehouse (DW), as the enterprise data backbone.
The Problem with Designing Only Spokes
The flip side of cloning the source systems is cloning the tables or files used to generate reports or perform analysis. This is the classic trap that people designing a business intelligence (BI) system fall into when they are spoke myopic.
When an architect designs a house, she starts by examining the site, the style of house, the overall size and what rooms the owner wants. The design gets more detailed as the owner's family gives its input on the layout and each room. Eventually, they approve a plan and build the house.
For the most part, this is similar to IT working with the business to determine what data and reports it needs to build. IT works with the business to define the reports and then builds the DW and BI reporting.
But there is one big difference. Before an architect hands the blueprints to a contractor, she designs and draws the infrastructure. This includes the foundation, framing, heating/cooling, roof, plumbing, cabling and wiring. Although it seems obvious that the entire infrastructure must be designed and built into a house before anyone starts pouring concrete, that logic is often missing when it comes to building data warehouses.
Rather than building the hub-and-spoke design, some architects go right to the spokes, i.e., the data structures that support specific reports. That may work short term, but what happens when the business needs additional reports with expanded data fields, different business logic and performance metrics? IT is forced to do a costly and lengthy project to build the next spoke from the ground up - from the source systems to the spoke. You are forced to go to the source systems because you did not lay the infrastructure in the DW.
In addition, getting this new data and implementing the new transformations often forces changes in the initial reports. This ripple effect occurs because the existing reports were built from spokes built directly for the initial design - a design that had certain data, transformation and performance metrics built in that do not apply to every business situation.
This design trap most often occurs when BI developers drive the data architecture. BI developers are focused on specific reporting and analytical requirements rather than a broader design that creates a holistic, enterprise data backbone. It's the classic problem of not seeing the forest for the trees.
The Hub and Spoke
The design traps people fall into are just like the story of Goldilocks and the Three Bears; they're either too hot or too cold. With an overemphasis on either the hub or the spoke, the BI solutions are either too hard to use (hub centric) or too tough to change or expand (spoke centric).
The DW should be your data distribution hub for data marts or cubes for business reporting and analysis. As with any distribution hub, the DW exists to service the spoke. The warehouse needs to be flexible and scalable to support the changing and expanding needs of its spokes.
You need to design your data marts or cubes from the bottom up. This involves getting detailed business specifications on the data, transformation and metrics. These specifications are narrowly confined to the business users and processes that provide input. Once you have those specifications for your spokes, you need to incorporate them as input into your top-down DW design. You need to step back and take a more holistic enterprise view for your DW specifications. With the hub-and-spoke approach, you need to load enterprise data into the hub; business reporting or function-specific processing needs to take place from the hub into the spokes.
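The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a real DW implementation: the order data, the field names and the functions load_hub and build_spoke are all hypothetical. The point it shows is that detailed data is loaded into the hub once, and every spoke is derived from the hub rather than from the source systems.

```python
from collections import defaultdict

# Hypothetical source-system extract at the lowest grain (one row per order line).
SOURCE_ORDERS = [
    {"order_id": 1, "region": "East", "product": "Widget", "qty": 2, "unit_price": 10.0},
    {"order_id": 2, "region": "East", "product": "Gadget", "qty": 1, "unit_price": 25.0},
    {"order_id": 3, "region": "West", "product": "Widget", "qty": 5, "unit_price": 10.0},
]


def load_hub(source_rows):
    """Load detailed enterprise data into the hub once.

    Shared business logic (here, a revenue calculation) is applied in the hub
    so that every spoke inherits the same definition.
    """
    hub = []
    for row in source_rows:
        hub_row = dict(row)  # copy so the source extract is left untouched
        hub_row["revenue"] = row["qty"] * row["unit_price"]
        hub.append(hub_row)
    return hub


def build_spoke(hub, group_by, metric):
    """Derive a report-specific spoke (data mart) from the hub, not the source."""
    totals = defaultdict(float)
    for row in hub:
        totals[row[group_by]] += row[metric]
    return dict(totals)


hub = load_hub(SOURCE_ORDERS)

# The first spoke, built for the initial report.
revenue_by_region = build_spoke(hub, "region", "revenue")

# A later spoke for a new report: no return trip to the source systems,
# and no change to the first spoke.
units_by_product = build_spoke(hub, "product", "qty")
```

Because both spokes read from the same hub rows, adding units_by_product does not touch revenue_by_region. In a spoke-only design, each of those structures would have its own extract and its own copy of the revenue logic.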
Resist the Urge
There are two major reasons for the spoke-centric approach. First, the design is often driven by BI developers who live to develop reports. These folks are often very tactical and delivery focused. It is common for the BI developers to concentrate on the immediate deliverables and lose sight of the value of the hub.
Second, the urge to get things done is often associated with coding, not designing. When the project starts, business people and IT management are waiting for reports to be delivered, so spending time on the DW design slows down report development. The BI developers can create reports so quickly that people believe if the report works, then all the data is fine. Unfortunately, many have learned the hard way what is behind the scenes: the DW makes a significant contribution to the long-term success of your BI solutions.