Your executives have been discussing different options for helping the company maintain and increase its competitive advantage in the marketplace. Many of these options are data and information related, and the concept of the data warehouse has been enthusiastically endorsed as one option. Your CIO calls you in and tells you to get right on it. What do you do? As a seasoned systems professional, you know about data warehousing, having read Bill Inmon's groundbreaking definition of it in 1990, and you have followed its evolution to the present day. You are conversant with architectures and tools. You have read about successful implementations and failures. Maybe you have even attended some data warehousing conferences. You have a skilled database staff that is eager to get trained in new tools and begin what they see as a leading-edge and challenging project. But you and your staff, with vast experience in the development and maintenance of operational systems, realize at your first meeting that this project is going to be different. Really different!

To begin with, you and your team must approach end-user requirements gathering radically differently than you would for an operational system. In developing an operational system, you would follow the traditional system development life cycle (SDLC) route with interviews, extensive research, process analysis and other tasks appropriate to structured development. A large percentage of the requirements would be gathered prior to any development, and a sign-off with the user would ensure a minimum of scope creep. A data warehouse, by contrast, requires a broad-brush requirements gathering effort that is more iterative in nature.

When beginning a data warehouse project, there is rarely a firm and succinct understanding of what the users want when you first sit down with them. They know they want a repository of the data they regularly use. They know they want easier access to the data they use intermittently, from both internal and external sources. They know they want to be able to report the data more easily. They know they want to be able to model the data without multiple downloads to their favorite spreadsheet. They want accurate data. But how? That is left up to the developer. Bill Inmon states that if 50 percent of the first iteration of the warehouse design is correct, then the design effort has been successful. With this in mind, let's discuss a general model for data warehouse requirements gathering that will meet or exceed this measurement.

A significant prerequisite to data warehouse requirements gathering is the construction of a data model which supports the attainment of the business benefits noted earlier. If one is not available, one must be built. Note that this model is what the business needs, not what it currently has, and is built with no proviso for technology. This model should reflect the major areas of the business and the relationships that exist between them, with the data interrelationships and attributes defined.

With the data model in hand, you can now compare it to the data that is currently available to the business--what Bill Inmon calls the system of record or the "mess." Your data model can serve as a benchmark for comparing the data the business needs with the best data it already has. You will readily see whether any existing data measures up to the data identified in the data model. There may well be none; but, in some cases, depending on the accuracy, the completeness and the timeliness of the delivery, there could be multiple data sources that correlate to the data model.
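The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the attribute names, the source system names and the three pass/fail quality flags are hypothetical, and a real assessment of accuracy, completeness and timeliness would be far less binary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceField:
    """A candidate field in an existing system (all values hypothetical)."""
    system: str
    accurate: bool
    complete: bool
    timely: bool


def match_model_to_sources(model_attributes, candidates):
    """For each attribute in the data model, list the source systems
    whose candidate field passes all three quality checks."""
    coverage = {}
    for attribute in model_attributes:
        fields = candidates.get(attribute, [])
        coverage[attribute] = [
            f.system for f in fields if f.accurate and f.complete and f.timely
        ]
    return coverage


# Hypothetical data model attributes and the fields found in existing systems.
model_attributes = ["customer_id", "order_total", "churn_risk"]
candidates = {
    "customer_id": [
        SourceField("billing", True, True, True),
        SourceField("crm", True, True, True),
    ],
    "order_total": [SourceField("orders", True, False, True)],
}

coverage = match_model_to_sources(model_attributes, candidates)
for attribute, systems in coverage.items():
    print(attribute, "->", systems or "no qualifying source")
```

In this made-up example, one attribute has two qualifying sources, one has a source that fails the completeness check, and one has no source at all, which mirrors the range of outcomes the comparison can produce.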

With the data model complete, you now begin the effort of identifying who to talk to. This may include a variety of people, ranging from accountants to executives to IT professionals. But the most important group will be the eventual users of the system who will be the primary source of what data is needed and how it is used. These people all need to be identified and contacted to participate in the requirements gathering process. All of these people will have differing views of what it is you are there for, and it is incumbent on the developer to establish highly ethical professional relationships with these people and not be lobbied in one direction or another.

Concurrent with this effort will be the identification and acquisition of any documents that spell out what the customer needs. These could range from a formal study to a statement of work, a request for proposal or a similar document. In a perfect world, these documents would be the problem description and leave little for you to do. However, in most cases, this description will be little more than an overview. This review of documents pertaining to the development effort may yield little of value, but it is essential to gaining insight into the project.

At this point the "meat" of the requirements gathering begins. Having identified the key people to talk to, you now arrange for an interview to take place. Again, the scope and the number of interviews may not equal that necessary for an operational system, but a good cross section of those noted earlier will need to be interviewed. There are a number of tricks to interviewing, but essentially it comes down to the following basic points:

  • Make sure you are interviewing the right person. One dissatisfied person can do a lot to sabotage your work.
  • Assure the interviewee that you are there to help solve a problem and that it can't be done without his or her help.
  • Share information. Trading information creates trust and partnership and leads to the acquisition of more information.
  • Prepare an interview script. Asking the same prepared questions of a number of interviewees can lead to a statistical sampling of very useful data.
  • Don't take anything for granted. If an interviewee says this system will make life great for the people in Department X, then go interview people in Department X.
  • After the interview, summarize the interview answers and check back with the interviewee for accuracy.
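As a rough illustration of how a shared interview script yields comparable data, the sketch below tallies the answers given to each scripted question. The questions, answers and interviewee labels are invented for the example, and it assumes answers have already been reduced to short categories when the interviews were summarized.

```python
from collections import Counter


def tally_answers(responses):
    """Count how often each answer was given to each scripted question."""
    tallies = {}
    for answers in responses.values():
        for question, answer in answers.items():
            tallies.setdefault(question, Counter())[answer] += 1
    return tallies


# Hypothetical summarized answers, keyed by interviewee.
responses = {
    "interviewee_1": {
        "How often do you need sales data?": "daily",
        "What is your main obstacle?": "manual downloads",
    },
    "interviewee_2": {
        "How often do you need sales data?": "daily",
        "What is your main obstacle?": "inaccurate data",
    },
    "interviewee_3": {
        "How often do you need sales data?": "monthly",
        "What is your main obstacle?": "manual downloads",
    },
}

tallies = tally_answers(responses)
for question, counts in tallies.items():
    print(question, dict(counts))
```

A tally like this is only a starting point; it shows where interviewees agree and where they diverge, which is useful input when the answers are later checked back with them.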

Interviewing takes you to a certain point in your warehouse requirements gathering; but to really get a firm handle on what is needed and desired, much more is necessary. One activity which is particularly useful is participation. There is, in most cases, a system or architecture which is currently in place and which does not satisfy the data needs of the enterprise or department. If there is, sit with some users for a day or two and work within the system(s). Experiencing the environment in which the users work has two distinct benefits. First, it allows the developer to test-drive the systems and experience the problems associated with them. Second, it allows the developer to get closer to the users and does a lot to ensure ongoing cooperation from them.

An equally critical component of the warehouse requirements gathering process is a facilitated session or sessions with the users and the development team. These Joint Application Design sessions, or JAD sessions, do much to reconcile the varied data gathered from the interviews, systems participation and document review. Each of the preceding efforts might yield separate problems rather than different facets of the same problem. Bringing together key participants in a structured environment under the direction of a trained facilitator can solidify the problem definition and lead to the production of a problem specification which will serve as the foundation on which to build the warehouse.

Here is where the difference between an operational system and a warehouse shows itself most vividly. An operational system would still need much more of everything--more time, more data, more sweat. However, with the effort you have expended up to this point, you can now begin the construction of your data warehouse. You have gathered enough requirements to select a subject area that is large enough to be meaningful and small enough to implement. The business benefits derived from this initial build will be measurable, with a good probability of success. The feedback loop will be set in motion as the users access the warehouse and start the collaborative process of growing it with larger and more complex subsets of data. The problem specification that has been created may be somewhat thin when compared to one for a tax accounting system, for example, but it will probably be fine for the start of the warehouse. Thus, the data model and problem specification will now serve as the foundation on which successive iterations of the warehouse will be built.

A data warehouse project is a lifestyle change for both the developer and the user. The developers must adopt a more business-oriented approach at the outset of warehouse development by putting themselves in the shoes of the user. As mentioned earlier, you want to start by identifying requirements whose objectives spell out exact, attainable business benefits that can be quickly seen and appreciated. To do this, the developers must understand the business and how the users will analyze the information provided to them. A warehouse is designed to deliver this information.

From this point forward, requirements come to you in stages as the users discover more uses of the information in the warehouse, and the warehouse grows with negligible interruption to the existing systems environment. Because of this focus on broader business needs, data warehouse development is more akin to business process reengineering concepts than to operational system design. And, much like ascending Jack's beanstalk, the end of the climb is never clearly in sight. The warehouse becomes a living, breathing entity that must be fed and nurtured over time to remain alive and productive. The warehouse ultimately reaches adulthood when it achieves recognition as a vital and indispensable piece in the information gathering process.
