Please help explain the 1) urgency of need and 2) elimination of risk for building a data warehouse.

  • Sid Adelman, Clay Rehm, Danette McGilvray
  • January 03 2006, 1:00am EST

We are in the process of building a data warehouse. We would like our project to be our number-one priority and need to come up with a business plan. I have been asked to explain 1) urgency of need and 2) elimination of risk. Can you help?


Sid Adelman's Answer: Urgency of need - Research what others in your industry are doing, especially your competition. Nothing motivates senior management more than the fear that your hated competitors are doing something better or smarter than you are.

Elimination of risk - You will never be able to eliminate risk, but these are some of the things you can do to minimize that risk:

1. If the mission and the objectives for the DW have not been defined:

  • Identify the sponsor of the DW.
  • Insist (strongly recommend) that the mission and objectives be defined prior to any serious activity.
  • Develop a straw man for the mission and objectives and propose it to the DW sponsor.

2. If the mission and objectives of the DW do not map to those of the enterprise:

  • If there are no explicit enterprise objectives, there are probably assumed objectives to which most people in the enterprise would subscribe. These should be documented and mapped to the DW objectives.
  • If enterprise objectives exist but the DW does not support them, rethink what you are trying to accomplish with the DW.

3. When there are problems with the quality of the source data (there always are):

  • If the quality of the source data is unknown, use a quality evaluation tool to determine just how bad things are. Report operational data with quality problems to someone high in the organization (perhaps the CIO), and then step back (it is not your responsibility to clean up dirty operational data).
  • Because the quality of the source data will be highly variable, try to convince the user to implement the cleanest data first (this will sometimes work).
  • If the user insists on putting dirty data in the DW, at least flag the data in metadata, indicating its level of quality.
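
As a rough illustration of flagging quality in metadata, the sketch below profiles a batch of source rows and records a coarse quality level per column. The thresholds, the column names and the use of missing-value rate as the only quality measure are assumptions made for this example; a real evaluation tool would check far more than missing values.

```python
def quality_level(null_rate: float) -> str:
    """Map a missing-value rate to a coarse label (thresholds are assumed)."""
    if null_rate <= 0.01:
        return "high"
    if null_rate <= 0.10:
        return "medium"
    return "low"


def profile(rows: list[dict], columns: list[str]) -> dict[str, dict]:
    """Return per-column quality metadata for a batch of source rows."""
    metadata = {}
    total = len(rows)
    for col in columns:
        # Treat None and empty strings as missing values.
        missing = sum(1 for r in rows if r.get(col) in (None, ""))
        rate = missing / total if total else 1.0
        metadata[col] = {"null_rate": round(rate, 3), "quality": quality_level(rate)}
    return metadata


# Hypothetical source rows, for illustration only.
rows = [
    {"customer_id": 1, "postal_code": "60601"},
    {"customer_id": 2, "postal_code": ""},
    {"customer_id": 3, "postal_code": None},
    {"customer_id": 4, "postal_code": "10001"},
]
quality_metadata = profile(rows, ["customer_id", "postal_code"])
```

Here `quality_metadata` could be stored alongside the column definitions, so that users querying the DW can see which columns came from low-quality sources.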

4. To provide the skills to support the DW:

  • Define the functional responsibilities of data administrators, database administrators, application developers and user liaisons. Define the skill levels required for each of these positions.
  • Sell management on the need to have skilled people on the DW team.
  • Sell management on the need to have these people sufficiently dedicated to the project.

5. To be sure you have an adequate budget in place:

  • Compile industry publications, presentations, etc. that indicate what a DW will normally cost. Watch out for those who give figures for selected subsets of the effort or who disregard costs assigned to some other departments.
  • Itemize each of the costs for your project. Don't pad the numbers, but don't underestimate just because you think the true cost will frighten management into paralysis.
  • If the numbers are too high, consider a smaller project or one that does not require some big ticket items (such as a new DBMS, other expensive software, or major new hardware).

6. For supporting software (extract, cleansing, BI tools, DBMS, etc.):

  • Understand the benefit of this software to the project. If it does not benefit this specific project, justification can only be accomplished if major follow-on projects will significantly benefit from its use.
  • Quantify the costs of not using the software. These costs should include the additional effort to write the code, the ongoing costs to maintain the code, the costs of delay and the potential for reduced quality of the implementation.
  • Identify only the software that can make a major contribution. Avoid recommending a piece of software that is fun, leading edge and a resume enhancer, but does not make a significant contribution to the project.

7. Focus on the source data:

  • If source data has been neither inventoried nor modeled, it is probably because IT management does not recognize the importance of these activities. Any such recommendations would probably be seen as delaying the project. In fact, the inventory and modeling effort is long and laborious. If management has not already recognized the benefits of these activities, it's unlikely that the DW project will sell them. The DW should not be used as justification for data modeling or for inventorying the data.
  • If a data modeling tool is in place that has reverse engineering capability (the ability to take database definitions [DDL], capture them in the data model encyclopedia and generate rough models), this reverse engineering could be the least costly and most acceptable course of action.

8. To be sure you have a strong, well-placed, reasonable user sponsor:

  • Take your time. Make a list of sponsors that match the above criteria, and put the strongest ones on top. Research their decision support requirements, and determine which problems could be well-served by the DW. Invite #1 to lunch, sell that user on the DW, outline what would be needed from them and from their department, and ask for their sponsorship.
  • If #1 is not agreeable, invite #2.
  • When you are down to The User from Hell, stop and do something else.

9. Focus on the primary business users of the data warehouse:

  • If your users are not computer literate, budget more money for user support.
  • Allow more time for the expected volume to be achieved. Readjust your expectations.
  • Revamp the training so as not to frighten the students.
  • Provide mentors in the training process.
  • Develop a more comprehensive set of predefined queries.
  • Choose an extra-user-friendly front end (choose warm and fuzzy over power and function).

10. Address the users' expectations for the DW:

  • Be honest. Don't misrepresent what the users will be getting, their required involvement, the costs or the schedules.
  • Never, never, never be coerced by anyone to accept unrealistic time frames or budgets.
  • Document what the users will be getting and when (some installations ask the users to sign this document).
  • Continue to remind the users of what they will be getting and when.
  • If you have a user who is unwilling to accept your estimates, give someone else the opportunity to work with that user.

Tom Haughey's Answer: First, I would strongly suggest starting by getting business managers to specify their goals, such as reduce customer attrition by x percent this year, decrease disability costs by y percent this year or improve customer cross-sell by z percent each year. You want to make sure your warehouse eventually supports these goals by storing the necessary data. A well-known practice is to ask managers for their top-ten questions or what answers they most need. Initially focus on big-ticket questions - questions that would have a significant payback. Try to get them to put a value on the answer to each question. Look at it this way: "If we give you the answer to this question, what is it worth to you?" See if they are willing to put a value on it. Nobody has to commit to the number but it makes the question tangible. For other examples, "If we could reduce customer onsite service calls by one call per customer per year, we could save x dollars (the cost of one visit per year times the total number of customers)." Or, "If we could anticipate potential disability accidents and thereby reduce disability payments by 1% per year, we could save xyz dollars." In fact, maybe they could even pay for the warehouse in one year!
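
The service-call example above is simple arithmetic, and a small sketch can make it concrete. All figures below (cost per visit, number of customers, warehouse cost) are hypothetical placeholders, not numbers from any real project.

```python
def annual_savings(cost_per_visit: float, customers: int,
                   visits_avoided_per_customer: float = 1.0) -> float:
    """Savings from avoiding onsite service calls: cost x customers x visits avoided."""
    return cost_per_visit * customers * visits_avoided_per_customer


def payback_years(warehouse_cost: float, savings_per_year: float) -> float:
    """Years needed for the yearly savings to cover the warehouse cost."""
    return warehouse_cost / savings_per_year


# Hypothetical inputs, for illustration only.
savings = annual_savings(cost_per_visit=150.0, customers=20_000)
payback = payback_years(warehouse_cost=2_400_000.0, savings_per_year=savings)
```

With these assumed inputs the savings come to 3,000,000 per year, so the warehouse would pay for itself in under a year. The point is less the result than getting a manager to supply the inputs.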

Second, I usually create the DW strategy using two means: a simple process and a simple framework. The simple process consists of five steps. The time to complete these steps could be two to four months, depending on size, complexity and resources:

  1. Define goals: what business goals you need to achieve.
  2. Define future vision: what the warehouse will look like in the long term.
  3. Assess current state: what it looks like now.
  4. Determine gaps: find the holes that need to be fixed.
  5. Formulate a plan (perhaps over three years, to go from where you are to where you need to be).

The simple framework is called BIAT (business process, information, application and technology). Here is a description of BIAT:

  • Business Processes: Gross business processes or collections of elementary activities (e.g., implement campaign, accept customer contact, change customer data).
  • Information: Primary entities and, in some cases, subject areas (e.g., subject, product, purchased product, prospect). The data flows across the business.
  • Application: Collections of implemented processes and procedures.
  • Technology: Technology types (e.g., messaging, relational DBMS, etc.).

Collect information on BIAT during each of the five steps above.

In my view there is one major way to avoid risk - don't do anything risky. More seriously, as I have said in other places many times, there are three major ways to eliminate risk once you have gone through steps 1 through 4 as mentioned: namely, incremental delivery, delivery in short intervals and prototyping. [See more details in my response to this month's question on "the relevance of a work breakdown structure in a data warehouse project."] Long-term promises and far-out schedules are risky. Deliver short-term results but on a stable base; in essence, this means on a broad stable database and a robust platform - but (and this cannot be said too often) built piece by piece.

When you test, test end-to-end. A warehouse has three major processes: gather, store and deliver. Test all three fully. For example, suppose we need some aggregates. We design them and put them in, and queries run faster. However, creating the aggregates blows our batch window by 15 minutes and the warehouse is late! End-to-end testing will discover this.
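
The batch-window example can be expressed as a check that an end-to-end test might run. This is only a sketch: the stage names, the durations and the six-hour window are assumed values chosen to reproduce the 15-minute overrun described above.

```python
from datetime import timedelta


def batch_overrun(stage_durations: dict[str, timedelta], window: timedelta) -> timedelta:
    """Return how far the summed stage durations exceed the batch window (zero if they fit)."""
    total = sum(stage_durations.values(), timedelta())
    return max(total - window, timedelta())


# Assumed timings for the three warehouse processes.
baseline = {
    "gather": timedelta(hours=2),
    "store": timedelta(hours=3),
    "deliver": timedelta(hours=1),
}
window = timedelta(hours=6)

# Building the new aggregates adds 15 minutes to the store step.
with_aggregates = {**baseline, "store": baseline["store"] + timedelta(minutes=15)}

overrun_before = batch_overrun(baseline, window)
overrun_after = batch_overrun(with_aggregates, window)
```

A test that looks only at query speed would pass after the aggregates are added. A check on the whole run shows that `overrun_after` is 15 minutes while `overrun_before` is zero.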

There are other factors. Do vendor evaluations (even conduct shoot-outs among vendors) and work only with dependable vendors. Look up their stats, including their financial stats. Keep the number of vendors and technologies to a minimum. Don't be afraid to get the vendors to contribute some (non-billable) assistance to your strategy project. There are no longer dozens of vendors to pick from in each area, such as ETL (extract, transform and load), BI (business intelligence) and DBMS (database management systems). You can find vendor evaluations in Gartner and all the other broad industry consultants. They can help you with your thinking, but I would not use them for the detailed work. They are very expensive for one thing. In choosing products, look not only for ease of use and performance, but also and very importantly for scalability. Can you add to the existing structure as needs grow? You want your DW to be a success, but you also want it to survive that success. If the DW becomes such a success that it cannot sustain the workload, then you have to re-platform, which will be costly and time-consuming.

Get some help from consultants who have been there and done that. As a manager, I preferred to draw on different opinions rather than rely on one large consulting vendor. Avoid those who seem to have obsessive attitudes about what you must and must not do. If you are looking for a long-term solution, I would avoid deploying a lot of data marts as your main strategy, especially independent data marts. Data marts will play a role, and you might start with a data mart as your first project. However, the more cross-functional the information you need, the more centralized your solution should be. Consider the painful lessons learned from companies today whose primary DW strategy is data mart consolidation! Look also for those experts who have had success in helping other companies be successful along the same lines as your plan.

Danette McGilvray's Answer: Urgency of Need. The urgency of need depends on your business situation and why you are building the data warehouse. Answer the following questions:

  1. Who will use the data warehouse?
  2. What kind of information will be provided?
  3. What questions is the business asking and how will the information from the data warehouse help answer them?
  4. Who is asking those questions? This should identify the people in your company who will benefit from the data warehouse.
  5. Why are those questions important to the business?
  6. How are those business questions being answered today? How long and how many people does it take to answer them currently?
  7. How can the data warehouse improve the ability to answer the questions accurately and reduce the amount of time to assemble the information? Quantify current time and resources and compare to estimated time and resources with the data warehouse.
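
The comparison in question 7 comes down to person-hours per request, multiplied by how often the request is made. The sketch below uses hypothetical figures throughout; the estimate with the data warehouse in particular is an assumption you would need to justify for your own project.

```python
def person_hours(people: int, hours_each: float) -> float:
    """Total effort for one request, in person-hours."""
    return people * hours_each


def yearly_hours_saved(current: float, with_dw: float, requests_per_year: int) -> float:
    """Person-hours saved per year if each request drops from current to with_dw."""
    return (current - with_dw) * requests_per_year


# Hypothetical figures, for illustration only.
current = person_hours(people=4, hours_each=40)   # four people for one week
with_dw = person_hours(people=1, hours_each=2)    # assumed effort with the DW
saved = yearly_hours_saved(current, with_dw, requests_per_year=12)
```

With these inputs the monthly request drops from 160 person-hours to 2, which saves 1,896 person-hours per year.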

Research the answers to these questions. Then create the story or stories that explain the situation and show the business need specific to your company. For instance, it comes as a surprise to many managers that what seems to them a simple request takes many people several days to produce. It's not unusual to hear that a request for a list of the company's top-ten customers, along with what they purchased in the last six months, takes four people a week to prepare. The actual effort is often hidden from those requiring the information.

Elimination of Risk. As with any project, you can never eliminate risk when building a data warehouse. However, you can lower your risk through proper management of specific potential problems. To manage your risk:

  1. Identify all potential risks from your project information and compile into one list. Risks typically fall into the categories of risk associated with scope, schedule, resources and quality. One area already known to be high-risk in a data warehouse project is data quality (i.e., poor data quality in the source systems being integrated into the warehouse).
  2. Analyze the risks by assessing the impact to the project of each risk (e.g., high, medium or low). Assess the likelihood that the risk will occur (high, medium, low or a specific percentage that estimates the probability of occurrence for each risk). Determine if it will be easy, difficult or not possible for the problem to be detected in advance. For example, poor data quality will have a high impact on the warehouse and is highly likely to occur. Detecting the quality problems in advance will require specific action.
  3. Prioritize the project risks using the results of your analysis. Manage only those risks with the most potential to damage the project. Data quality should be prioritized as a risk to be managed.
  4. Manage those high-potential risks by: 1) preventing the causes where possible or 2) developing contingency plans to deal with the effects. Incorporate the prevention tasks or the contingency plans into your overall project plan. For example, you can prevent many problems associated with data quality by incorporating data profiling tasks early in the project plan and using the results to do clean-up in source systems where possible or develop appropriate transformation rules.
  5. Continue to reassess the risks throughout the project.
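
The identify, analyze and prioritize steps above can be sketched as a small risk register. The scoring rule (impact times likelihood on a 1 to 3 scale), the threshold of 6 and the sample risks are assumptions made for illustration; the answer does not prescribe a particular formula.

```python
from dataclasses import dataclass

# Assumed numeric scale for the high/medium/low ratings.
LEVEL = {"low": 1, "medium": 2, "high": 3}


@dataclass
class Risk:
    name: str
    category: str     # scope, schedule, resources or quality
    impact: str       # high, medium or low
    likelihood: str   # high, medium or low
    detection: str    # easy, difficult or not possible

    @property
    def score(self) -> int:
        """Assumed scoring rule: impact multiplied by likelihood."""
        return LEVEL[self.impact] * LEVEL[self.likelihood]


def prioritize(risks: list[Risk], threshold: int = 6) -> list[Risk]:
    """Keep only risks scoring at or above the threshold, highest first."""
    kept = [r for r in risks if r.score >= threshold]
    return sorted(kept, key=lambda r: r.score, reverse=True)


# Hypothetical register entries.
register = [
    Risk("Scope creep from new report requests", "scope", "medium", "medium", "easy"),
    Risk("Poor data quality in source systems", "quality", "high", "high", "difficult"),
    Risk("Key staff reassigned mid-project", "resources", "high", "medium", "difficult"),
]
managed = prioritize(register)
```

With these entries, data quality scores 9 and heads the list, staffing scores 6 and is also managed, and scope creep (4) falls below the threshold. Rerunning the prioritization as ratings change is one way to carry out step 5.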

Clay Rehm's Answer: Your business plan and urgency of need must be authored and presented by your business partners that are in such dire straits because they don't have a data warehouse. If they are really having a problem, it should be fairly easy to identify the cost and benefits to be realized.

Your project will only get number-one priority if you can prove that your organization cannot function without a data warehouse. Thoughts to keep in mind include: Can the warehouse reduce expenses? Can it produce revenue that did not exist before? How can your data warehouse impact the bottom line of your company? If your data warehouse cannot save or create money, what can it do? What is the purpose and goal of it? What will be the long-term goals of having it or not having it?

I am not sure that any project can completely eliminate all risk. However, I would list the known issues and risks and how you may mitigate each one of them. Keep in mind that one way to address a risk is to accept it and have no solution for it at all.
