Continue in 2 seconds

What are the reasons one would choose data warehousing over a relational design?

  • August 01 1999, 1:00am EDT


What are the reasons one would choose data warehousing over a relational design?


Doug Hackney's Answer: The key issues are:

  • ease of use
  • level of user hand-holding
  • size of user base
  • response time
  • server/RDBMS workload

Relational designs are perfect for OLTP and nearly useless for user-access decision support purposes. A normalized design forces the users to be extremely knowledgeable about the data, its location, its structure, join types, etc. In short, if you are not a high-end knowledge worker steeped in the arcane details of the data and the structure of the database, it's nearly unusable. Standard dimensional designs used to manifest the data to DW system users, in contrast, are marked by their intuitive nature and ease of use.
This high level of required user knowledge for relational systems leads to two critical issues related to the sustainability of the DW system. The first is that because the system requires a very high level of knowledge of the data, its structure, join types, etc. you will invest a huge amount in user support. Users typically won't use the system at the frequency needed to retain the level of expertise required to be proficient. Instead, they will call support. You might as well just assign a support person to co-locate with each user and get it over with.

The second issue is if only the experts can use the system, you will, over time, have a very small number of users. After the first wave of enthusiasm passes, average rank and file users will abandon the system, claiming it is too difficult to understand and use. The handful of experts left using the system will prove to be inadequate when new leadership rolls in and decides to amortize the cost of the data warehouse across the user base. This is a very unpleasant experience to endure, and I encourage you to build a system that supports the widest possible user base in order to avoid this fate.

In addition to these issues, a relational design also places undue loads on both the server and the RDBMS due to its requirement to have a multitude of joins to answer even the most simple business question. You end up spending a ton of money on very powerful servers that nearly collapse under the strain of forcing the relational model to do something it just isn't ever going to be able to do well. Your users experience excruciatingly slow response, they abandon the system, and your funding dries up. Not a pleasant outcome, and another scenario to be avoided.

There's a reason the industry uses denormalized and dimensional designs. It's because they work and they are sustainable.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access