How quickly things become conventional wisdom. The other evening, I was having dinner with some executives, and the topic was data warehousing. One executive had just changed companies and was in the midst of selling the merits of data warehousing to his organization. The gentleman had successfully built a data warehouse in a previous organization and was charged with bringing that success with him into this new environment.

The gentleman said that he was receiving resistance to data warehousing because the senior executives knew that data warehousing was a failure. Every time he broached the subject, the other executives spoke of the failure rate of data warehousing, which they thought was approximately 70 to 80 percent. It was simply conventional wisdom that the vast majority of data warehouses were failures. Everybody knew that. Or did they?

Alarmed at the prospect of something as promising as data warehousing being considered a failure, I called my good friend Claudia Imhoff, whose consulting company specializes in data warehousing. I asked Claudia how many failures she had seen. She informed me that she had seen many successes and she had seen some warehouses that had delivered less than what had been promised, but she couldn't point me to any failures. I called Mike Schroeck of PricewaterhouseCoopers and asked him about data warehouse failures. He didn't know of any, but he knew of plenty of successes. I called the consultants at Computer Associates, and they couldn't find any failures. If there really was a 70 to 80 percent failure rate, these failures were well hidden.

In continued search of substantiation for this conventional wisdom, I got on the Internet and found this failure rate quoted in presentations. I contacted the presenters, who had acquired their numbers from an official survey taken in Europe. I asked one presenter exactly whose survey documented this huge failure rate, and he pointed me to the Data Warehouse Network and their annual survey.

Next I contacted Data Warehouse Network, which has now merged with the Business Intelligence Division of Sybase. They said they discovered this 70 to 80 percent failure rate among data warehouse developers by asking the following question: From the time of the initial design of your data warehouse, has the design or the technology supporting the data warehouse changed? If a respondent answered yes, the data warehouse was considered to be a failure. A failure! Over 70 percent of the people answered yes to this question. Data Warehouse Network went on to say that over 95 percent of the people who had changed their design continued with the warehouse development effort.

This interpretation of what constitutes a data warehouse failure is simply wrong – dead wrong. The very essence of data warehouse development involves iterations. It is probably safe to say that every mature warehouse has changed significantly from its original design which – according to this definition of a failure – makes every data warehouse development effort a failure. In tagging a normal part of data warehouse development a failure, Data Warehouse Network has perpetrated a great disservice to data warehousing because the failure rate is nowhere near 70 to 80 percent.

Changing the design or supporting technology of a data warehouse in no way constitutes a failure. Instead, making changes to the design of a data warehouse usually signals progress. This misguided survey has created the basis for conventional wisdom, and that conventional wisdom is wrong.

A large and respected industry group later conducted another survey to determine the success and failure rates of data warehousing. The group asked: Has your company had a data warehouse project which has been abandoned for more than six months with no active work done on the development of the warehouse in that time? Fewer than five percent of the companies that responded had abandoned their warehouse development effort for more than six months. Now, abandoning a warehouse development effort for more than six months constitutes a failure. Changing your mind about some aspect of the design of a data warehouse is not a failure. There is a huge difference between a 70 to 80 percent failure rate and a five percent failure rate. A huge difference.
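
The arithmetic behind the first survey's own figures is worth a quick look. The sketch below is only a back-of-the-envelope illustration: it takes the stated floor values (70 percent answered yes, 95 percent of those continued) as exact, which the survey did not claim, and the function name `share_discontinued` is a hypothetical helper, not anything the survey published. It estimates what share of all respondents both changed their design and then stopped development. That is not the same measure as the six-month abandonment question, so the two numbers are roughly comparable at best.

```python
from fractions import Fraction


def share_discontinued(changed_pct: int, continued_pct: int) -> Fraction:
    """Share of ALL respondents who changed their design and then stopped.

    changed_pct:   percent of respondents who answered yes (design changed)
    continued_pct: percent of those respondents who kept developing
    Both inputs are assumed exact here; the survey reported them as "over" values.
    """
    changed = Fraction(changed_pct, 100)
    continued = Fraction(continued_pct, 100)
    return changed * (1 - continued)


# Figures quoted from the first survey, treated as exact for illustration.
labeled_failures = Fraction(70, 100)
stopped = share_discontinued(70, 95)

print(f"Labeled failures by the survey:  {float(labeled_failures):.1%}")
print(f"Changed design and then stopped: {float(stopped):.1%}")
```

Under those assumptions, only a few percent of respondents changed their design and then stopped, which is in the same range as the second survey's abandonment figure and nowhere near 70 percent.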

Unfortunately, it wasn't just an obscure European survey that got it wrong. The mistake was magnified and repeated because it was spotted and used by other organizations in an insidious manner. Some other organizations use this misinformation as a selling tool. So, who is it that is filling the air with gross misinformation? And – more to the point – why are they telling this little white lie?

There is a whole class of vendors that considers the data warehouse to be an obstacle to a payday. For example, consider the sales cycle of a data mart vendor. A data mart vendor goes to a trade show and finds a prospect. The prospect becomes enamored of the data mart software. The data mart vendor pays a visit to the client, and enthusiasm for the data mart product grows. Then, a database administrator in the back of the room says, "We like your product, but in order to use your product most effectively we have to build a data warehouse. Come back in a year."

What does any self-respecting salesperson do? Eliminates the database administrators from the decision-making process, of course. If the database administrators had their way, the data mart vendors would close their sales a year from now, not this quarter. However, a year from now, the clients may be enamored of another data mart product. Therefore, the data mart salespeople look upon anyone suggesting a data warehouse as an enemy to be eliminated. What better way to eliminate people than to completely discredit them?

How does the data mart salesperson discredit the database administrator (or any other person who gets in the way of the sales process)? The easiest way to do this is to say that a credible industry source has stated that there is a 70 to 80 percent failure rate for data warehouses – regardless of the truth or reality. The vendors will do anything to make a sale now, rather than tomorrow – and usually this works. The myth that data warehouses fail at a high rate has become conventional wisdom. In this case, however, conventional wisdom is simply not true. In fact, it is not merely untrue – it is grossly untrue.

There are, of course, other tactics used by the salespeople when it appears that an organization might actually be getting ready to build a data warehouse. In an attempt to dissuade an organization from building a data warehouse and thus extending the sales cycle, salespeople may say:

  • But a data warehouse is so big – there is so much data in it.
  • You're going to have to get and cleanse a bunch of old legacy data.
  • You could just start small, with a data mart, and grow that data mart into a data warehouse.
  • The big-bang approach to data warehousing development never works.

It is true there is a lot of data in a data warehouse; but handled properly, the volume of data does not have to eat you out of house and home. It is also true that old legacy data must be accessed, cleansed and conditioned, but is that such a bad thing? Shouldn't that data have been cleansed a long time ago? Is your data mart vendor telling you that you ought to build a data mart on old, invalid data? As for starting small with a data mart, a data mart is a data mart at any size, and it will never turn into a data warehouse. Finally, the data mart enthusiasts are absolutely correct in saying that the big-bang approach never works. From the very beginning, good form for data warehouse development has been to build things iteratively. Proper data warehouse development does not build things in a big-bang fashion.

For some people, truth doesn't matter. If you are one of those people, go ahead and listen to conventional wisdom. However, if you care about truth and if you care about what your vendor is telling you, you must be sure your vendor is properly presenting the facts.
