We are already deep into the data warehouse boom. Nearly every organization worth its salt has one. Some have multiple warehouses and are continuously trying to integrate them.
In the early 2000s, data warehouses were the iPhones of the IT world: everyone had to have one. They were cool, were great for snazzy demos, let a bunch of applications piggyback on them and cost a truckload of money. There was no shortage of reasons for having a data warehouse, and some of the business cases made for wonderful reading.
Nearly a dozen years down the line, not a lot has changed on this front. CIOs still love data warehouses, but other developments now need to be factored into the business case:
- Advancements in several technologies have provided CIOs with alternatives to the traditional data warehouse ecosystem.
- The financial slowdown has sent IT managers scrambling to find funds to support a long-term investment such as an enterprise data warehouse.
- Data warehouse technologies and methodologies have evolved to the extent that it is a whole different ballgame. The volume and range of technologies, hardware and software involved require a long time to master. The risk of a wrong turn has multiplied.
- Various government regulatory bodies have increased their focus on IT and data retention.
Data Warehouse Failures
One quick fact demonstrates why it is absolutely vital to get the right answer to the classic question, “Do we need an EDW?”
Studies and surveys report varying, and sometimes relatively high, data warehouse failure rates – some upwards of 50 percent.
The inference is clear, even if the numbers aren’t. By some of those counts, your data warehouse is more likely to fail than succeed. This is not to say that data warehouses aren’t needed or useful. In fact, it is just the opposite. The benefits of a well-designed data warehouse are well-documented (and aren’t the subject of this article). The reasons for failure are varied, but rarely lie with the technology. More often than not, the seeds of failure are sown during conception: Someone simply failed to come up with a realistic business case for the data warehouse.
It is worthwhile to take a step back and re-examine the whole question.
When conceptualizing a data warehouse, a common mistake is to look at it as a solution. Organizations expecting to magically solve business problems (e.g., improving sales and reducing costs) by inserting a data warehouse in their enterprise are in for a not-so-pleasant surprise. Business problems are solved by people, not technologies. A data warehouse is just a step in the solution. In its simplest definition, it is a store of integrated, non-volatile data. Everything else is just an extension, leveraging the various technologies available. Organizations also forget that real solutions often involve a change in processes, cultures and mindsets. With that in mind, it becomes very easy to understand the high EDW failure rates reported. Simply put, the expectations placed on data warehouses are not valid because users attribute to the data warehouse functions that it does not perform.
Data Warehouse Expectations
It is fair to ask, “What can I expect out of my enterprise data warehouse?”
Gimmickry and marketing language aside, you should expect your enterprise data warehouse to be able to offload certain resource-heavy activities, like reporting, regulatory data retention and data sharing, from your mission-critical transactional systems. This also has hidden benefits: By offloading these activities, your physical database structures aren’t burdened by the modeling, indexing, hardware and support personnel constraints associated with them. In certain scenarios, this by itself can be a huge benefit, often repaying the money invested into the data warehouse several times over. You should also expect to have a repository of cleansed, standardized and integrated historical data from multiple transactional systems that can support advanced analysis and decision support applications. Additionally, implementation of an EDW is a good opportunity to provide a boost to an organization’s data governance efforts, and gain more control of data security and more flexibility for your information management department.
Data delivery, or reporting, and analytics fall under the scope of business intelligence, and this term is closely tied to another common misconception: that a data warehouse includes BI. It does not. Without question, a well-designed data warehouse is a boon for your BI initiatives, but technology today has advanced to the point that you can have a full-fledged BI solution in place without having a data warehouse. There are several tools that leverage technologies like in-memory analytics and data federation in order to facilitate analytics without a physical data warehouse. There are also architectures that take a hybrid approach and perform well. For example, if you need real-time BI, you could adopt a physical data warehouse for the historical data and data federation for real-time information, feeding both into an appropriate visualization tool.
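To make the hybrid approach concrete, here is a minimal sketch in Python. It is an illustration only: the two in-memory lists stand in for a physical warehouse and a live transactional source, and every name in it (WAREHOUSE_SALES, LIVE_SALES, federated_sales, total_by_store) is hypothetical rather than taken from any real federation product.

```python
from datetime import date

# Hypothetical historical rows, standing in for a physical data warehouse.
WAREHOUSE_SALES = [
    {"day": date(2024, 1, 1), "store": "north", "amount": 120.0},
    {"day": date(2024, 1, 2), "store": "north", "amount": 80.0},
    {"day": date(2024, 1, 2), "store": "south", "amount": 50.0},
]

# Hypothetical rows not yet loaded into the warehouse, standing in for
# data read live from a transactional system through a federation layer.
LIVE_SALES = [
    {"day": date(2024, 1, 3), "store": "north", "amount": 30.0},
    {"day": date(2024, 1, 3), "store": "south", "amount": 45.0},
]


def federated_sales(warehouse_rows, live_rows, cutoff):
    """Combine both sources into one view.

    Warehouse rows are used up to and including the cutoff (the last
    load date); live rows are used only after it, so nothing is
    counted twice.
    """
    historical = [row for row in warehouse_rows if row["day"] <= cutoff]
    current = [row for row in live_rows if row["day"] > cutoff]
    return historical + current


def total_by_store(rows):
    """Aggregate sales per store, as a reporting tool might."""
    totals = {}
    for row in rows:
        totals[row["store"]] = totals.get(row["store"], 0.0) + row["amount"]
    return totals


combined = federated_sales(WAREHOUSE_SALES, LIVE_SALES, cutoff=date(2024, 1, 2))
totals = total_by_store(combined)
print(totals)
```

The cutoff date is the design choice that matters here: it marks where the warehouse load ends and the live feed begins, so the visualization layer sees one consistent data set.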
So, Do You Need An EDW?
Now, we come back to the question, “Do you need an EDW?” Unfortunately, the short answer still is “It depends,” and the accuracy with which the proponents of a data warehouse in the organization can elaborate on this determines the success or failure of the data warehouse. There are some important considerations to help with the assessment.
Are you a large organization with multiple business processes, locations and operations, generating a large amount of data that needs to be consumed actively by your employees? Nine out of 10 organizations like this will need an EDW, but you may still find a reason to avoid splurging on one (keep reading). In fact, the size of the organization has more to do with the cost of the EDW than with the need for the EDW.
Have you analyzed your IT landscape recently? If you can count the number of applications and technologies on your fingers, you probably do not need an EDW. You could look toward virtualization and federation technologies to support your BI needs. For data retention, the venerable tape archive will suffice.
Are you in a business that frequently needs to refer back to historical data? Retail and financial institutions often use historical data to predict customer behavior. It is no surprise that most breakthroughs in data warehousing and BI have been brought about by implementations in such organizations. If your business needs heavy analytics, forecasting and data mining, a well-designed EDW is a must for you. On the flip side, a static business with little need for analyzing historical data - rare, it must be mentioned, in current times - does not need to invest in one.
If one of the needs driving the business case for your EDW is self-service, enabling ad hoc data access, empowering employees or reducing IT workload, it would be wise to think again before proceeding. Nearly all data warehouse implementations include these in their requirements, and post-implementation, the usage numbers are discovered to be embarrassingly low. Small-scale BI search applications may serve your purpose better than a large-scale EDW. It is also common knowledge that loading and maintaining data warehouses are complex processes. The increasingly common trend of storing EDW data at the transactional granularity has addressed the question of load complexity, but the resulting increase in data volume and scope offsets that. Either way, it is a misconception that an EDW reduces IT support needs.
If you are a relatively new business, with only the latest technologies and applications built to modern collaborative standards using standardized processes, you can perhaps do without a data warehouse. There are tools available that can virtually integrate all your data and provide you with a consolidated view for your BI needs. If you have built your applications correctly, there may be a reduced need for data cleansing.
Building an EDW is not only a costly proposition, but also a very lengthy one. Whole books have been written on the question of ROI and how to architect EDWs in order to reap their benefits as soon as possible. Simply put, the shorter your implementation cycles, the more likely you are to incur massive costs on rework and tweaking. On the other hand, a year is a long time in a technology’s lifecycle. You would have to factor in costs for upgrading both software and hardware once you are done. If you are not prepared to do this or if your business is so dynamic that waiting a year is not an option, you would be well-advised to steer clear of EDWs and focus your attention on other approaches.
There are many other considerations, but one often overlooked is the human aspect. Sometimes an organization is just not ready to take advantage of the possible benefits of an EDW for various reasons, such as work culture, established processes, interest levels, knowledge of the business and technical prowess. Investing a huge amount of money in hopes of sparking an IT revolution in your organization is a bad idea. Organizations should build an EDW only when the need has been irrefutably established, preferably by a small-scale pilot.