Data Warehousing in Government

Published May 01 1998, 1:00am EDT

Every large organization, in business or government, has large quantities of data that have been collected over a period of years. Most large corporations recognize that there is value in this historical data and have undertaken projects to build data warehouses to make the data accessible in a meaningful and timely manner.

Government agencies have every bit as much data sitting in their libraries as any large business. Only a few government agencies have begun to tap into this data in any useful way. Why is this? Surely the people who manage government programs have just as much need for information about their organization as their peers in the private sector; why are they behind the curve in this area?

Let's examine a few of the ways in which data warehousing can go astray--ways that are especially common in the government environment. Among the topics for discussion are project requirements, data ownership, the procurement process and ongoing support for the warehouse.

If the purposes of the data warehouse were well planned prior to procurement, the process of defining and building the warehouse should be no different from that in any other kind of organization. If, however, the process began with buying a box and then deciding what data to put on it, things are not so clear. Organizational agreements and boundaries can be very difficult challenges.

Whose Data Is It?

Questions of data ownership, data security and confidentiality are more likely to plague data warehousing projects in government than in the private sector. Your auto insurance company has information about your driving record, but no one suggests that they should make that data available to the public. On the other hand, there are often calls for governments to make available a wide range of information based on their collected data. These calls for public access to data arise in widely divergent areas such as teacher accountability, hospital death rates, driver records, environmental hazards and airline safety statistics.

The conflict that arises over access to public data boils down to the fact that public data is collected and stored with public funds, yet individuals and corporations continue to have the right to privacy. That conflict has already been underway for many years and cannot be resolved here. Data warehousing only raises the debate to a new level, because now it is possible to make data available directly to the public.

Fears over data security can impact data warehousing projects in government even when there is no intent to make the data available outside the agency. Even within agencies such as state departments, there is often conflict about who owns particular sets of data and who should have access to the data. Sometimes this conflict is simply due to turf protection by management, but other times it is set in law. For example, one state recently implemented digitized images for its driver licensing system. Yet, by legislative mandate, these digital images can be used for one purpose only: to create a driver's license. The image cannot even be made available to a police officer for the purpose of verifying that the photo on a license is the correct one.

This problem can be addressed in several ways. Intra- or inter-agency agreements on confidentiality can sometimes satisfy security concerns. The technology itself can also provide an answer, through database views or through selective data models designed at the query-tool level. If aggregated data is sufficient to satisfy user needs, identifying information can be omitted and confidentiality preserved. There is no single solution to this issue because the needs and limitations encountered vary so widely, but security concerns should not be allowed to deter the development and deployment of the warehouse.
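
To make the view-based approach concrete, here is a minimal sketch using Python's built-in sqlite3 module. Everything in it is hypothetical and for illustration only: the driver_record table, its columns, the sample rows and the two view names are assumptions, not a description of any agency's system. One view drops the identifying columns; the other exposes only aggregated counts. SQLite has no per-user permissions, so the step of granting users access to the views but not to the base table is left out; in a full DBMS that step would be handled with the product's own access controls.

```python
import sqlite3


def build_demo_warehouse() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical base table and two views."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical base table holding identifying information.
        CREATE TABLE driver_record (
            license_no TEXT PRIMARY KEY,
            full_name  TEXT,
            county     TEXT,
            violations INTEGER
        );

        -- Made-up sample rows.
        INSERT INTO driver_record VALUES
            ('A100', 'Pat Example',     'Adams', 2),
            ('A101', 'Lee Sample',      'Adams', 0),
            ('B200', 'Kim Placeholder', 'Baker', 1);

        -- View 1: row-level data with the identifying columns omitted.
        CREATE VIEW driver_record_deidentified AS
            SELECT county, violations
            FROM driver_record;

        -- View 2: aggregated data only, one row per county.
        CREATE VIEW violations_by_county AS
            SELECT county,
                   COUNT(*)        AS drivers,
                   SUM(violations) AS total_violations
            FROM driver_record
            GROUP BY county;
        """
    )
    return conn


def county_summary(conn: sqlite3.Connection) -> list:
    """Return the aggregated rows an analyst would see through view 2."""
    query = (
        "SELECT county, drivers, total_violations "
        "FROM violations_by_county ORDER BY county"
    )
    return conn.execute(query).fetchall()


def deidentified_columns(conn: sqlite3.Connection) -> list:
    """Return the column names exposed by view 1."""
    cursor = conn.execute("SELECT * FROM driver_record_deidentified LIMIT 0")
    return [column[0] for column in cursor.description]


if __name__ == "__main__":
    warehouse = build_demo_warehouse()
    print(deidentified_columns(warehouse))
    print(county_summary(warehouse))
```

Analysts who query only the views never see a name or license number, while the base table remains available to the program that legitimately owns it.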

In developing a set of requirements for a data warehouse, it is much easier to focus on the technologies and ignore the needs. Too often, procurement is based on the hardware or DBMS the IS department prefers. This is not the place to begin a data warehousing procurement, but it is the easiest place to start. This problem, incidentally, is not unique to government agencies. It may, however, be more common in government, especially when the particular agency falls under a "master contract" which specifies which approved technologies must be used.

Before deciding which hardware is the right choice, it is important to understand what the data warehouse will do. Who will use it and why? What user communities have an interest in the data, and what do they plan to do with the data? What data do they need, and how many months or years of it do they want? Who owns the data? Have they agreed to have their data placed in a warehouse?

While these seem like fundamental questions, it is not uncommon for a procurement to document in painful detail the technical requirements of the hardware and the functional requirements of the DBMS, but nowhere explain why the warehouse is needed in the first place.

Why tell vendors what you plan to use the data warehouse for? Because while all vendors are in business to make money, most also want to help their customers succeed. They can only do that if they understand the customer's needs. Making the vendor a part of your project and asking the vendor to suggest the best approach to reach your goal gives your agency a much better chance of success.

Procurement Woes

It has long been a challenge for government agencies to keep abreast of current technology, largely because of the intricate procurement rules, lengthy review process and funding issues surrounding large purchases. Data warehouses are, for the most part, large purchases in terms of both dollars and time. While corporations can, if they choose, simply buy the product they feel is best, government agencies seeking to build a data warehouse will generally have to issue a Request for Proposal (RFP).

Since this is not an article about reforming government procurements, we will make no attempt to describe all of the potential review-and-approval pitfalls that can snag a well-intentioned data warehouse procurement. However, those who have never worked in government procurement may be amazed at the intricacies of the process.

A state agency trying to procure hardware, software and/or services for the construction of a data warehouse may need approval from any or all of the following:

  • Departmental planning or purchasing
  • Department management
  • End-user staff
  • State purchasing authority
  • State CIO
  • Legislative committee
  • Legislative funding authority
  • State funding authority
  • One or more federal agencies
  • Public oversight committees

Additionally, the approval of each of these entities may be needed multiple times. For example, for some state-level procurements that involve federal funding, the agency must first prepare an advance-planning document, describing the technology to be acquired and its intended uses, for review and approval. This is followed by review and approval of the RFP document that will be issued to prospective vendors. Either or both of these documents may be recycled several times as recommended changes are incorporated. It is not unusual for a single document to be in the review process for a year or more.

When an RFP is actually released, bids from competing vendors must be evaluated and scored. Depending on the procurement rules, the winning vendor may be selected on price, compliance with requirements or other objective criteria. Rarely is a contract awarded solely because a vendor offers the best technology or because of the past success of a vendor's implementations.

Once a vendor is selected, it is very common (again, depending on the agency or state) for one or more losing vendors to file a protest or appeal, which can delay the project for months or even years. (Protests are a favorite ploy of incumbent vendors: even if they ultimately lose, they continue to collect maintenance and licensing revenues while the appeal process slogs on.)

After all of this, when a vendor is selected and the project begins, it is not uncommon for the hardware or process selected to be approaching obsolescence. The project may then be halted for a significant period of time while the newest products and services are decided upon.

Another challenge to be faced in government data warehousing is that governments tend to think in terms of finite projects: requirements are developed, programs written, a solution implemented and that's it. A data warehouse doesn't work that way. Like a new puppy, the acquisition is only a small part of the total challenge. To be successful, a data warehouse requires ongoing attention both from operations staff and from analysts and business users. Current data must be added regularly, but more important is the ongoing evaluation of new requirements and opportunities for additional data sets to be brought into the warehouse. When funding authorities are unable to understand or support this need for continued growth, developers are faced with uncomfortable alternatives. One alternative is to buy a system that is much too large for the initial implementation. The other is to buy small and force additional users and data to wait until enough time has passed or demand has accumulated to justify an upgrade.

Is there a way out of this dilemma? Probably not, again because each case is unique. The safest bet is to buy as much as you can possibly justify initially and hope you can get more when you need it. The consolation is that a successful data warehouse does a lot to justify itself. If user demand is truly that--users demanding access to the system--then users can be a major asset in justifying upgrades. Make them your allies as you grow the warehouse and let them help you in your justifications. They may be able to document cost avoidance or fund recoveries that make system expansion an obvious choice.

Can It Ever Work?

All of these challenges may lead to the conclusion that data warehousing can never be successful in the public sector. That's not the case. At both the federal and the state level, several data warehouses are already in place and providing value. Data warehousing is used in revenue, human services, law enforcement, health and numerous other types of agencies. When government data warehouses are implemented properly, the ultimate beneficiary is the only customer governments really have: the taxpayers. In other words--us.
