Data Warehousing and National Security

  • December 01 2001, 1:00am EST

This month's column is co-authored by Bob Terdeman.

The data warehouse was born into a world of commerce. Because market share, profitability and enhancement of revenues are mission-critical objectives, data warehousing has thrived and survived. In becoming part of the business establishment, data warehousing has lowered the cost of information and greatly shortened the amount of time required to access that information. The tremendous advantage in the access to and the processing of information that is afforded by data warehousing has been the cornerstone of competitive information systems in many arenas including telecommunications, retailing, banking/finance, insurance and pharmaceuticals.

However, the advantages of data warehousing are not limited to commerce. Data warehousing offers many special advantages to those who protect our country. The first is the ability to manage information over time. The retention of historical data is basic to every data warehouse. Data warehouses are designed to remember information, which is one of the many requirements for national security.

The second advantage of the data warehouse for national security is the ability to handle large amounts of data. The compilation and integration of records about citizens and visitors to our country would generate a great deal of data. In some cases, the data needs to be placed on massive amounts of disk storage. In other cases, data needs to be arranged in a hierarchy where some data is on high-performance disk storage and other data is placed on supporting overflow storage, yet still available for analysis. The ability to store and process massive amounts of data must be tempered with the cost of storage in the configuration of the data warehouse.
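The storage hierarchy described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that data is placed by how recently it was accessed; the column does not specify a placement rule, and the function name, tier labels and 365-day window are all hypothetical.

```python
from datetime import date

# Hypothetical cutoff: data touched within this window stays on fast disk.
HOT_WINDOW_DAYS = 365


def choose_tier(last_accessed, today, hot_window_days=HOT_WINDOW_DAYS):
    """Pick a storage tier for a record based on its last access date.

    Both tiers remain available for analysis; only cost and speed differ.
    """
    age_days = (today - last_accessed).days
    if age_days <= hot_window_days:
        return "high_performance_disk"
    return "overflow_storage"


today = date(2001, 12, 1)
print(choose_tier(date(2001, 11, 1), today))
print(choose_tier(date(1999, 1, 1), today))
```

Widening the window keeps more data on the expensive tier, which is one way the cost trade-off in the configuration might be expressed.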

Prior to data warehousing, it was not possible to keep track of huge amounts of detail, and corporations could only build what were called open-loop systems. Closed-loop systems became possible with data warehousing and the ability of technology to keep track of millions and millions of individuals. In a closed-loop system, the information about each and every customer is stored and is available for recall and analysis. The same is true for national security. With data warehousing, details can be kept and carefully examined.

The third advantage of a data warehouse is that it operates on integrated data. Integration requires infrastructure and architecture and is undoubtedly the most difficult aspect of data warehousing, but it is also one of the most important. Once the architecture has been built and the infrastructure constructed properly, it becomes possible to relate activities and events that seemingly have no relationship or at most a very casual relationship.

From the standpoint of national security, integration can and should be performed across a wide variety of systems and sources. The data warehouse is designed for flexibility and is able to accommodate integration of data from a very wide variety of sources, a previously impossible feat. A massive network of stovepipe systems is replaced by a data warehouse-centric architecture that ultimately costs less and is far more powerful.

Integration of data in a national security data warehouse is achieved by means of a different kind of database design than would normally be found in a standard commercial data warehouse. Data in a national security data warehouse needs to be related by means of a pass-through key. A pass-through key is a key that has meaning only within the context of the data warehouse. It is similar to the record locator number that airlines use for each ticket that has been issued or for which a reservation has been made.

The value of a pass-through key is that it allows individuals and organizations to be called anything or multiple things without losing track of the true identity of the individual or the organization.
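The idea of a pass-through key can be sketched as a lookup that ties any number of external names to one warehouse-internal identifier. This is a minimal illustration under stated assumptions, not a description of any actual system: the class name, its methods and the simple name normalization are all hypothetical.

```python
class PassThroughKeys:
    """Maps external names (aliases) to a warehouse-internal key."""

    def __init__(self):
        self._next_key = 1          # keys have meaning only inside the warehouse
        self._key_by_alias = {}

    @staticmethod
    def _normalize(name):
        # Assumed normalization: lowercase and collapse whitespace.
        return " ".join(name.lower().split())

    def new_entity(self, name):
        """Return the key for a name, assigning a fresh one if it is new."""
        alias = self._normalize(name)
        if alias not in self._key_by_alias:
            self._key_by_alias[alias] = self._next_key
            self._next_key += 1
        return self._key_by_alias[alias]

    def add_alias(self, key, name):
        """Record that an existing entity is also known under another name."""
        alias = self._normalize(name)
        existing = self._key_by_alias.get(alias)
        if existing is not None and existing != key:
            raise ValueError(f"{name!r} already belongs to key {existing}")
        self._key_by_alias[alias] = key

    def resolve(self, name):
        """Return the pass-through key for a name, or None if unknown."""
        return self._key_by_alias.get(self._normalize(name))

    def aliases(self, key):
        """List every normalized name recorded for a key."""
        return sorted(a for a, k in self._key_by_alias.items() if k == key)


keys = PassThroughKeys()
person = keys.new_entity("John Q. Smith")
keys.add_alias(person, "J. Smith")
print(keys.resolve("j.  SMITH") == person)
```

Because every record in the warehouse would carry the key rather than the name, a change of name in an external source only adds an alias; the records already stored stay linked to the same entity.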

The fourth advantage of a data warehouse for national security is that it makes data easily and quickly available for analysis. In the commercial sense, this usually means making data available for transactions or for multidimensional processing. For the purposes of national security, it primarily means making data available for data mining and for sifting through millions of occurrences of data. There are, of course, uses of the data warehouse where sweeping reports will be made. Commercial operations do this kind of processing all the time. However, the major value of data warehouses for national security is that they give the analyst the opportunity to examine data at a detailed level.

Data warehouses universally require infrastructure and architecture. National security data warehouses are no different in this regard than any other kind of data warehouse. The architecture for a national security data warehouse requires that:

  • Data be able to be gathered from multiple and disparate sources.
  • Data be integrated (the most difficult part of the architecture) into a single physical structure.
  • Huge volumes of data be handled.
  • Data be prepared for analysis.
  • Data be designed to be at the lowest level of granularity at the point where flexibility is most important.
  • Data be designed with pass-through keys so that changes to external data will be picked up by the system.
  • Data be prepared for long-term viability of the physical security of the data warehouse.

