Justifying data warehouse investments poses a dilemma for many organizations. Using instinct with marginal quantitative estimates has been the norm for years as many found the data warehouse return on investment (ROI) difficult to estimate and measure. However, mantras supporting this approach, such as "essential to the business," "the results are too unpredictable" and "yields intangible benefits," have become insufficient for many of today's upper-level decision-makers. An approach that has come into more favor recently is viewing the problem by asking the question, "How many more pants/policies/accounts/airline seats do we need to sell to pay for the data warehouse and will the data warehouse be responsible for the lift in sales or reduction in expenses necessary to generate ROI?" Some projects are also being scoped at a data mart level more conducive to quick payback and ROI measurement.

The largest study of data warehouse ROI, conducted by International Data Corp. in 1996, touts an average (sans outliers) 401 percent three-year return on investment for 62 companies with active data warehouse projects. The more interesting part of the study was the dramatic range of reported ROIs: as low as -1,857 percent and as high as 16,000 percent. With a range that wide, a data warehouse project can hardly be justified by quoting statistics. The track record is uneven at best, and it is still hard for decision-makers to ascertain what makes a data warehouse successful.

Many data warehouses have been justified by mandate based on widely held beliefs that data warehouses are a good thing. Drivers include better data quality, the gaining of competitive information, trend analysis, easier access to data, uniform data and support efficiencies. In many cases, you can simply trace these back to the business not knowing that needed data is available, not knowing where the data is or being unable to get to it. Underlying these ambiguous concepts are usually more specific, unambiguous measures. Increasingly, these underlying measures are being surfaced, and anyone trying to justify a data warehouse without them should beware. Data warehouse justification is more frequently being done with ROI; and in cases where ROI is not used for justification but is requested at a later point in the project's life cycle, it will be much more difficult to measure.

Clearly, some things don't work when it comes to justifying an expenditure on a data warehouse project. Changing architectures, clean data and moving off the mainframe need more quantification. Sales and marketing managers relate to sales volume. Operations management wants to reduce expenses and is concerned with inventory management. Executive management wants profits, market share, improved time to market and ability to identify new markets. These are all related to ROI at some level.

Once the need is uncovered, what must be understood is what key benefit of the data warehouse combines with the key need of the decision-maker as it relates to ROI. Data warehouse efforts often can and should be justified through a comparison of estimated ROI to other uses for the funds. Several years ago it was difficult to find a data warehouse project that had attempted to ascertain its cost/benefit. Recently, with the internal Year 2000 competition for funding and resources and various other factors including some high-profile failures, it is becoming expected that data warehouse projects be justified. Relationships between the CEO and the sponsor and/or the CIO that are still maturing also exacerbate the need for specificity.

Technology Supported Projects

Projects that use technology are not fundamentally different from other business investments. It is important to know how data warehouse efforts actually relate to corporate income and expenses. The methods come to light by repeatedly drilling down on the simple question: Why are you doing that? Sometimes the way a data warehouse is expected to increase income or reduce expenses turns out to be dubious, and a different course of action may be a more effective approach to determining ROI.

Data warehouse ROI is about accumulating all these returns and investments in the "why" chain from the data warehouse build, maintenance and associated business and IT activities through the ultimate desired result, considering all possible outcomes and their likelihood. Using ROI for justification is reducing the proposed net change in activities to their associated anticipated cash flow. Often, a cost of money is used to reflect in today's numbers the present value of expected cash flows in the future.
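The discounting described above can be sketched in a few lines. This is only an illustration: the function name, the yearly cash flows and the 10 percent cost of money are all hypothetical assumptions, not figures from any actual project.

```python
def present_value(cash_flows, cost_of_money):
    """Discount yearly net cash flows to today's value.

    cash_flows[0] is the cash flow today (year 0), cash_flows[1] is
    one year out, and so on. cost_of_money is a yearly rate (0.10 = 10%).
    """
    return sum(
        flow / (1 + cost_of_money) ** year
        for year, flow in enumerate(cash_flows)
    )


# Hypothetical example: a 1,000,000 build cost today, followed by three
# years of net returns (benefits minus maintenance).
cash_flows = [-1_000_000, 300_000, 500_000, 600_000]
cost_of_money = 0.10  # assumed rate, chosen only for illustration

npv = present_value(cash_flows, cost_of_money)
print(f"Net present value: {npv:,.0f}")
```

Without discounting, the same cash flows would net 400,000; applying a cost of money shrinks that figure because the later returns are worth less in today's numbers.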

While it would not be reasonable to attempt to measure all benefits, large and small, resulting from the data warehouse (such as paper saved by making reports available online), the only uncalculated estimates and results are those that you are unaware of or choose not to measure due to the time, expense and difficulty involved in doing so. So-called intangible benefits are simply benefits that have been judged unreasonable to measure. Many such benefits usually arrive with the data warehouse but are too indirect to be used as the primary measurement for data warehouse justification: the feedback that the operational systems receive, getting disparate business units to work together, creating a leading-edge image, getting faster access to data, and data quality.

If you were simply selling a subscription service to the data warehouse, the estimated returns would be the estimated number of customers times the subscription rate. Such a data warehouse has a direct benefit. Most data warehouse returns are not as clean. Returns are indirect, sometimes at multiple levels of indirection, and always with multiple possible outcomes. Costs include hardware, software, network, IT personnel and business personnel. A common mistake is to assume that data warehouse ROI can stand on direct benefits matched solely against technology costs, when associated business costs and returns often need to be added in.

Unlike the gambler in Las Vegas rolling dice and spinning roulette wheels with a vague understanding of the odds, the successful data warehouse promoter looks at the possible business results, at the odds of winning, calculates the amount to bet and counts his chips at the end of the day in order to understand if his methods need to be recalculated. Return on investment is the common language, much like using the cash system as opposed to the barter system.

Defining the Data Warehouse

A data warehouse is literally a combination of subject areas, data sources, user communities, business rules to be applied and architecture. Mix and match these components, including the degree each will be covered, with a phased approach for maximum effectiveness. Each phase should have ROI justification, and each phase should be able to stand on its own. However, since phases build on one another, the ROI used for justification is unlikely to ever be hit on the mark. The ROI of the entire project could exceed any estimated ROI as future phases add to the ROI of the data warehouse project as a whole.


Figure 1: Data Warehouse Phase ROI Estimates

Data warehousing is a compromise. Data warehouses can be any combination of the components. Different possibilities need to be considered before deciding on the makeup of a phase. Each candidate combination has its own ROI formulation, including its own assumptions and possible outcomes. See Figure 1. There are at least three reasonable ROI outcomes to develop for a data warehouse phase: high success, mild success and failure. Each outcome has a probability associated with it. These numbers must be generated by the business, often with IT playing the role of catalyst and consultant. The possible scenarios form probability distributions.
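As a rough sketch of how the three outcomes and their probabilities might be combined into a single probability-weighted estimate for a phase, consider the following. The Outcome class, the probabilities and the ROI figures are hypothetical and would in practice come from the business.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    name: str
    probability: float  # likelihood of this outcome, between 0 and 1
    roi: float          # ROI if this outcome occurs (1.0 = 100 percent)


def expected_roi(outcomes):
    """Probability-weighted ROI across all outcomes of one phase."""
    total = sum(o.probability for o in outcomes)
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"probabilities sum to {total}, expected 1.0")
    return sum(o.probability * o.roi for o in outcomes)


# Hypothetical estimates for one candidate phase.
phase = [
    Outcome("high success", 0.25, 3.00),
    Outcome("mild success", 0.50, 0.50),
    Outcome("failure", 0.25, -1.00),
]

print(f"Expected ROI: {expected_roi(phase):.0%}")
```

Running the same calculation for each candidate combination of components gives a common basis for comparing phases, although the spread between the best and worst outcomes deserves as much attention as the weighted figure.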

Return on Investment

ROI has to take the complete current state into consideration and only be measured on the delta. For example, if it is for a second or later phase of a data warehouse, the installed phases are part of the current state. Data warehouse phases should build on former phases, with each satisfying a business problem or preparing the way for a larger problem to be solved in the next phase.
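Measuring only on the delta can be illustrated as follows. This is a minimal sketch with hypothetical names and figures: the benefit already produced by installed phases is treated as part of the current state and excluded from the new phase's return.

```python
def incremental_roi(current_benefit, proposed_benefit, phase_cost):
    """ROI of a new phase measured only on the change from the current state.

    current_benefit:  benefit already delivered by the installed phases
    proposed_benefit: total benefit expected once the new phase is added
    phase_cost:       cost of building and running the new phase only
    All three cover the same time period.
    """
    delta_benefit = proposed_benefit - current_benefit
    return (delta_benefit - phase_cost) / phase_cost


# Hypothetical three-year figures for a second phase.
roi = incremental_roi(
    current_benefit=1_200_000,
    proposed_benefit=1_800_000,
    phase_cost=400_000,
)
print(f"Phase 2 ROI on the delta: {roi:.0%}")
```

Crediting the second phase with the full 1,800,000 would overstate its return, since 1,200,000 of that is already being realized without it.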

ROI estimates and measurements must be quoted with an accompanying time duration to be meaningful (e.g., 58 percent ROI in three years). The reason is simple: cash flows change over time. All things being equal, maintenance costs should decrease and benefits should increase over time, yielding an ROI that grows over time. At some point, you may assume straight-line cash flows, but this is usually at a point in the future (5+ years) beyond which most CFOs will not accept analysis.
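To see why the time duration matters, the sketch below computes cumulative ROI at the end of each year for one set of cash flows. The yearly costs and benefits are hypothetical, and for simplicity the figures are not discounted.

```python
def roi_by_horizon(yearly_costs, yearly_benefits):
    """Cumulative ROI at the end of each year, keyed by year number.

    ROI for a horizon = (total benefits - total costs) / total costs,
    using everything spent and returned up to that year.
    """
    results = {}
    total_cost = 0.0
    total_benefit = 0.0
    for year, (cost, benefit) in enumerate(
        zip(yearly_costs, yearly_benefits), start=1
    ):
        total_cost += cost
        total_benefit += benefit
        results[year] = (total_benefit - total_cost) / total_cost
    return results


# Hypothetical figures: costs fall after the build year, benefits ramp up.
yearly_costs = [800_000, 200_000, 150_000]
yearly_benefits = [300_000, 700_000, 900_000]

for year, roi in roi_by_horizon(yearly_costs, yearly_benefits).items():
    print(f"ROI after {year} year(s): {roi:.0%}")
```

With these assumed numbers the same project shows a loss after one year, breaks even after two and is clearly positive after three, so an ROI figure quoted without its horizon says very little.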

Most data warehouse phases will provide one or a small number of key organizational ROI benefits. When data warehouses, as many have, replace outdated and non-integrated stovepipe systems or automate manual processes, tracking down numbers is easier. Such expense-reducing data warehouse projects find their benefit in the form of reduced carrying costs for the older system, a reduction in marketing costs through targeted marketing programs or a reduction in manpower required to carry out required business tasks.

Other expenses that can be reduced with data warehouse projects are losses due to fraud, through better detection, and write-offs due to having inadequate data to combat challenges from vendors and customers. In healthcare, a reduction in claims may be targeted. The overproduction of goods and commensurate inventory holding costs can also be minimized with data warehouse efforts.

However, many of today's successful data warehouses often go beyond cost savings and enable business transformation through technology. This is more difficult to estimate and measure as the possible outcomes are much more numerous and harder to predict. The strategic focus of many of these warehouses has often used vaguely defined corporate metrics that are only recently becoming part of the corporate vocabulary. Examples include customer retention using a customer's lifetime value as the measure; targeted marketing that looks not to save on marketing costs, but increase the customer base; promotion analysis programs that, likewise, look to increase customer number and penetration; and lowering time to market to take advantage of market conditions. Data warehouses that target a combination of cost saving and/or increasing income are also becoming popular as many data warehouses are replacing costly legacy systems that do inadequate jobs at producing ROI.

Avoid the "killer app" mind-set, which assumes that a use for the warehouse that justifies its existence will eventually be found. Know in advance what forms the "killer app" may take and how that may be reflected in ROI.

Arriving at consensus estimates of return is the most difficult challenge for data warehouse justification with ROI. No matter the sponsorship or housing of the data warehouse internally, it is a business project supported by IT, not the other way around. Therefore, ultimately the business is responsible for the ROI estimation. IT can support and even lead the effort by providing the proper questions that need to be asked of the business and helping generate the answers through interactive sessions and controlled experiments. A prototype with actual business data helps to generate the answers needed to estimate ROI. Also the establishment of a corporate governance committee to facilitate the business driving the direction of the data warehouse effort can be a forum for confirming and refining ROI estimates.

The biggest problem in getting consensus estimates for ROI is concern over the potential accountability that comes with surfacing quantitative measures of performance. In cultures where qualitative factors play the largest role in determining management performance, justifying with ROI will be difficult at best.

ROI plays a vital role in determining the characteristics and approval of a data warehouse. If the data warehouse in question cannot increase revenue or decrease expenses, either in the short term or long term, and therefore generate positive ROI, then there may be a different formulation of the data warehouse that is more effective. It is the task of many data warehouse promoters today to formulate the economic expectations and generate economic value for their data warehouse projects.