True or false: Development projects in the data warehousing arena are no different from other large systems development projects. The answer: true and false. There are certainly many similarities that call for the same "tried and true" project management techniques, but it is equally important to understand the unique characteristics of data warehouse development to ensure that project management is effective. This column focuses on some of the unique attributes of data warehousing projects that have ramifications for their project management.

Data warehousing development projects:

  1. Never result in a "finished" data warehouse.
  2. Must consider data load sequencing, data cleansing and development of the data dictionary.
  3. Must have significant attention devoted to managing stakeholder expectations.

First, let us look at #1 and #3 together because, in fact, the solution to the challenge of #1 is #3. Why is a data warehouse never really finished? Developing a data warehouse, by anyone's standards, is an enormous undertaking. In addition to the challenge of building a system broad in scope, business requirements are ever-changing. The proverbial moving target makes the declaration of "success" a tricky proposition. The fact that the warehouse is not finished can be a source of great consternation to those whose needs are not yet met and to those who pay the bills. The project manager must make sure from day one that expectations are properly managed. Everyone needs to understand the ever-changing priorities, the timelines and the compromises. The project manager must pay particular attention to those most concerned about the unfinished warehouse.

It is important for those whose needs are not being met by the initial deliverables to understand if and when their needs will be addressed. It is not uncommon for lower-priority requirements to be chronically postponed. The project manager owes the owners of these requirements an honest assessment of what it will take (resources, organizational support, etc.) to break the postponement pattern.

Those who pay the bills must clearly understand the one-time development costs, the ongoing operation costs and the hardware and training investment. The estimated costs of implementing future phases need to be at least broadly understood. Although there are many unknowns that will impact future costs, project managers are advised to provide placeholder figures that can be continually refined. Above all, stakeholders need to understand that unless there is continual investment in the future, the data warehouse will "die on the vine." Project managers must devote considerable energy to communication to ensure that expectations are correctly set from the beginning and continually reinforced.

Now to #2, beginning with data load sequencing. Data warehouses (or the operational data stores that feed the warehouses) receive data feeds from multiple sources. Specific reporting requirements of the data marts dictate the timing of their refresh. This, in turn, imposes requirements on data warehouse updating, on the sequencing and timing of specific data feeds, and on load windows. Of course, one could leave to chance the delivery of the most up-to-date information to the data marts consistent with the business rules. A better approach is for the project manager to use an understanding of the business requirements, data loading requirements and available loading windows to negotiate optimal data feed delivery schedules.
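To make the sequencing question concrete, here is a minimal sketch in Python of how a load order and a load window check might be worked out before negotiating delivery schedules. The feed names, durations and dependencies are hypothetical, and the sketch assumes the feeds are loaded strictly one after another; a real warehouse would likely load independent feeds in parallel.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical feeds: name -> (load duration in minutes, feeds that must load first)
FEEDS = {
    "customers": (30, []),
    "products": (20, []),
    "orders": (45, ["customers", "products"]),
    "sales_mart": (25, ["orders"]),
}


def plan_loads(feeds, window_minutes):
    """Return (schedule, fits) for a strictly sequential load.

    schedule is a list of (feed, start_minute, end_minute) tuples in an order
    that respects every dependency; fits is True when the last load finishes
    within the available load window.
    """
    # Map each feed to its predecessors, then let the sorter order them.
    graph = {name: set(deps) for name, (_, deps) in feeds.items()}
    order = list(TopologicalSorter(graph).static_order())

    schedule, clock = [], 0
    for name in order:
        duration = feeds[name][0]
        schedule.append((name, clock, clock + duration))
        clock += duration
    return schedule, clock <= window_minutes


if __name__ == "__main__":
    schedule, fits = plan_loads(FEEDS, window_minutes=120)
    for name, start, end in schedule:
        print(f"{name}: minute {start} to {end}")
    print("fits in window:", fits)
```

Even a rough model like this gives the project manager something specific to bring to the table: if the window shrinks or a feed arrives late, it shows which data mart refresh is at risk.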

Data cleansing, or the process of ensuring that certain key data elements are meaningful and accurate, is particularly important in data warehousing. By the time a specific data mart provides application-specific data, the source may not be apparent, and inaccuracies and limitations readily understood by some may be entirely invisible to others. This speaks to the need for the project manager to ensure that processes are in place to clean the "garbage in" and avoid the "garbage out" which, particularly in a warehousing environment, might be masquerading as accurate information.
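As an illustration of what such a process might look like at its simplest, the following Python sketch separates clean records from rejected ones and records why each rejection occurred. The field names (`customer_id`, `quantity`) and the two rules are assumptions made for the example, not rules from any particular warehouse.

```python
def cleanse(records):
    """Split records into (clean, rejected).

    Each rejected item keeps the original record together with the list of
    problems found, so the owners of the source system can correct it.
    """
    clean, rejected = [], []
    for rec in records:
        problems = []

        # Hypothetical rule 1: every record needs a non-blank customer_id.
        if not str(rec.get("customer_id") or "").strip():
            problems.append("missing customer_id")

        # Hypothetical rule 2: quantity must be a non-negative integer.
        qty = rec.get("quantity")
        if not isinstance(qty, int) or qty < 0:
            problems.append("quantity must be a non-negative integer")

        if problems:
            rejected.append({"record": rec, "problems": problems})
        else:
            clean.append(rec)
    return clean, rejected


if __name__ == "__main__":
    rows = [
        {"customer_id": "C1", "quantity": 3},
        {"customer_id": "", "quantity": 2},
        {"customer_id": "C2", "quantity": -1},
    ]
    clean, rejected = cleanse(rows)
    print(len(clean), "clean,", len(rejected), "rejected")
    for item in rejected:
        print(item["record"], "->", item["problems"])
```

Keeping the reasons alongside the rejected records matters as much as the filtering itself, because it makes the limitations of the data visible to those who would otherwise never see them.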

Finally, the project manager must ensure that adequate time and effort go into developing the data dictionary. A characteristic of a warehousing environment is the wide variety of individuals who may need to use the information kept in the warehouse. The definition of even the most basic data may vary from user to user depending on their particular use. To a product manager, for example, the number of widgets sold last month may or may not match the accounts receivable manager's notion of the number sold. The project manager must ensure that the data dictionary accurately and unambiguously defines the data so that those who build data marts and applications know exactly what they should and should not be using.
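The widget example can be sketched as a tiny data dictionary in Python. Everything here is hypothetical: the entry fields, the two term names and their definitions are invented to show how an ambiguous notion such as "units sold" might be split into unambiguous, separately owned terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryEntry:
    name: str
    definition: str
    source_system: str
    steward: str  # who answers questions about this element


# Hypothetical entries: "units sold" is deliberately absent because it is
# ambiguous; each business view gets its own precisely defined term.
DATA_DICTIONARY = {
    entry.name: entry
    for entry in [
        DictionaryEntry(
            name="units_ordered",
            definition="Units on orders booked in the calendar month, before returns.",
            source_system="order_entry",
            steward="product management",
        ),
        DictionaryEntry(
            name="units_invoiced",
            definition="Units on invoices issued in the calendar month, net of credit memos.",
            source_system="accounts_receivable",
            steward="finance",
        ),
    ]
}


def describe(term):
    """Return a one-line description of a term, or raise if it is undefined."""
    entry = DATA_DICTIONARY.get(term)
    if entry is None:
        raise KeyError(f"'{term}' is not defined in the data dictionary")
    return (
        f"{entry.name}: {entry.definition} "
        f"(source: {entry.source_system}, steward: {entry.steward})"
    )


if __name__ == "__main__":
    print(describe("units_ordered"))
    print(describe("units_invoiced"))
```

Refusing to answer for an undefined term is a deliberate choice in this sketch: a data mart builder who asks for "units sold" is forced to pick the definition that matches the business question.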

Next month I'll discuss some key actions of successful project managers.
