The System of Record in the Global Data Warehouse


The system of record has long been one of the cornerstones of the data warehouse environment. The system of record is the place where the definitive value for a unit of data resides. The idea is intuitive and dates back to the days of transaction processing systems. In banking, the system of record means that the balance of your account is kept in exactly one place. If there is no system of record for your bank account, or if there are multiple systems of record for the same account, something is fundamentally wrong. The same concept applies to many other systems, such as insurance policies: the insurance company has one and only one place that serves as the system of record for your policy. The company that sends you catalogs in the mail has (or ought to have!) only one place where you are known as a customer, and so forth.

Where there is a single system of record, there is integrity of data. Where there is no system of record, there is no integrity of data. For this simple reason, the system of record is one of the most basic concepts of the information systems environment.

Consider what happens when there is a global data warehouse. A global data warehouse is a data warehouse with multiple supporting local data warehouses. For example, there might be a headquarters data warehouse in New York City and separate country data warehouses in France, the U.S., Hong Kong, Saudi Arabia and Canada. Or there might be a central global data warehouse in Chicago, with a data warehouse for manufacturing in Dallas, a data warehouse for sales in Detroit and a data warehouse for distribution in Duluth. In each of these cases, there is a need for central data and a need for local data. In turn, there is a need for integrity of data across both local and global warehouses.

The question naturally arises: Where is the system of record? Is it at the global data warehouse, at the local data warehouse or somewhere in between? If there are multiple data warehouses, doesn't that destroy the concept of the system of record? Not at all. In the face of global data warehouses, the system of record distributes over multiple locations in accordance with the business function being served by the global data warehouse.

An example shows how the system of record becomes distributed across local and global data warehouses. Suppose there is a global data warehouse for multiple lines of business. The global data warehouse resides in Chicago, the manufacturing data warehouse in Dallas, the sales data warehouse in Detroit and the distribution data warehouse in Duluth. The data found in the global data warehouse in Chicago is probably only financial. The only business connection between manufacturing, sales and distribution is at the dollar level. There is no attempt to create a commonality of data between, say, sales and manufacturing. Therefore, the information found in the global data warehouse does not challenge the system of record, because the system of record for each of the different kinds of data resides at the local level.
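
The distribution of the system of record by business function can be pictured as a simple registry. The following Python is only a minimal sketch under stated assumptions: the class name SystemOfRecordRegistry, its methods and the subject labels are hypothetical and do not come from any product. It shows one way to enforce the rule that each kind of data has exactly one system of record, even though those systems sit in different warehouses.

```python
class SystemOfRecordRegistry:
    """Hypothetical registry: each data subject has exactly one owning warehouse."""

    def __init__(self):
        self._owner = {}  # data subject -> warehouse holding its system of record

    def register(self, subject, warehouse):
        # A second system of record for the same subject is an integrity error.
        if subject in self._owner:
            raise ValueError(
                f"{subject!r} already has a system of record in {self._owner[subject]}"
            )
        self._owner[subject] = warehouse

    def lookup(self, subject):
        # A subject with no system of record is just as much of a problem.
        if subject not in self._owner:
            raise LookupError(f"no system of record for {subject!r}")
        return self._owner[subject]


# Locations follow the example in the text; the subject labels are assumptions.
registry = SystemOfRecordRegistry()
registry.register("manufacturing", "Dallas")
registry.register("sales", "Detroit")
registry.register("distribution", "Duluth")
registry.register("consolidated financials", "Chicago")

print(registry.lookup("sales"))
```

Keying the registry by data subject rather than by warehouse is the design choice that matters here: several warehouses can coexist, but no subject can ever have two owners.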

However, financial data does need to be collected at the global level. The financial data is first collected at the local level and is then passed to the global level. In passing from the local level to the global level, the definition of data and its classification often change. For example, the global data warehouse may have a different definition of revenue than the local data warehouse. There are many reasons why a basic transformation takes place as data passes from the local to the global data warehouse. This transformation explains why data in the global data warehouse does not add up to the simple sum of the data in the local data warehouses: there are different systems of record. There is the local system of record, which is based on the business as the local operation understands it. Then there is the global system of record, which operates on a different set of business rules. Finally, there is the transformation, which defines how one system of record relates to the other.
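
To make the effect of such a transformation concrete, here is a small Python sketch. Everything in it is assumed for illustration: the figures are invented, and the global rule (revenue reported net of returns) is just one hypothetical example of a definition that differs from the local one. It is not a claim about how any particular warehouse defines revenue.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LocalRevenue:
    warehouse: str    # local data warehouse that owns the record
    gross: Decimal    # revenue as the local operation defines it
    returns: Decimal  # returns, tracked locally but not netted out of revenue


def to_global_revenue(record: LocalRevenue) -> Decimal:
    """Hypothetical global business rule: revenue is net of returns."""
    return record.gross - record.returns


# Invented figures, for illustration only.
local_records = [
    LocalRevenue("Dallas", Decimal("1000.00"), Decimal("50.00")),
    LocalRevenue("Detroit", Decimal("2000.00"), Decimal("125.00")),
    LocalRevenue("Duluth", Decimal("500.00"), Decimal("0.00")),
]

# Simple sum of revenue under the local definition.
local_sum = sum((r.gross for r in local_records), Decimal("0"))

# Revenue under the global definition, after the transformation.
global_total = sum((to_global_revenue(r) for r in local_records), Decimal("0"))

print(local_sum, global_total)
```

With these invented figures the local definition gives 3500.00 and the global definition gives 3325.00. Neither number is wrong; each is correct under the business rules of its own system of record, and the function to_global_revenue is the documented bridge between them.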

In many cases, the granularity of data changes as the system of record changes. Data at the local level is kept at a lower level of granularity (that is, in more detail) than data at the global level. For example, at the local level, "sales" means a record of each individual sale. At the global level, "sales" means sales by country, accumulated across all of the individual records. If you want to find the details of a particular sale, you go to the local data warehouse. If you want a global perspective of all sales, you go to the global warehouse. In this sense, the system of record for a global data warehouse/local data warehouse environment is split.
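
The change in granularity amounts to a roll-up. The Python below is a sketch with invented records and a hypothetical function name, roll_up_by_country; it shows detailed local sales being accumulated into the country-level figures a global warehouse would hold.

```python
from collections import defaultdict
from decimal import Decimal

# Local level: one record per individual sale (invented data).
local_sales = [
    {"sale_id": 1, "country": "France", "amount": Decimal("120.00")},
    {"sale_id": 2, "country": "Canada", "amount": Decimal("80.00")},
    {"sale_id": 3, "country": "France", "amount": Decimal("30.50")},
]


def roll_up_by_country(sales):
    """Accumulate individual sales into one total per country."""
    totals = defaultdict(Decimal)  # Decimal() is zero, so new countries start at 0
    for sale in sales:
        totals[sale["country"]] += sale["amount"]
    return dict(totals)


# Global level: sales by country, with the individual records no longer visible.
global_sales = roll_up_by_country(local_sales)
print(global_sales)
```

The roll-up is one-way. The country totals cannot be broken back down into individual sales, which is why questions about a particular sale have to be answered from the local warehouse.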

A local/global mapping is required to keep track of the transformations that occur as data passes from one level to another. The mapping identifies the system of record at the local level, the system of record at the global level, and the transformations applied to the data as it passes from one to the other. In addition, the mapping is time-variant: each state of the mapping is kept as the mapping changes over time. In this manner, the relationship between the local and global levels is preserved as it stood at any point in time.
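
One possible shape for a time-variant mapping is a list of versions, each with an effective date, that is never overwritten. The Python sketch below is an assumption-laden illustration: the names MappingVersion and MappingHistory, the field names, the dates and the rule descriptions are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class MappingVersion:
    effective_from: date  # first day on which this version applies
    local_source: str     # where the local system of record lives
    global_target: str    # where the global system of record lives
    rule: str             # description of the transformation between them


class MappingHistory:
    """Keeps every state of one local/global mapping as it changes over time."""

    def __init__(self) -> None:
        self._versions: List[MappingVersion] = []

    def add(self, version: MappingVersion) -> None:
        # Old versions are never replaced; the list stays sorted by date.
        self._versions.append(version)
        self._versions.sort(key=lambda v: v.effective_from)

    def as_of(self, day: date) -> Optional[MappingVersion]:
        # Find the latest version whose effective date is on or before `day`.
        dates = [v.effective_from for v in self._versions]
        index = bisect_right(dates, day)
        return self._versions[index - 1] if index else None


# Invented history: the global definition of revenue changes in mid-2003.
history = MappingHistory()
history.add(MappingVersion(date(2001, 1, 1), "Detroit.sales", "Chicago.revenue",
                           "gross revenue"))
history.add(MappingVersion(date(2003, 7, 1), "Detroit.sales", "Chicago.revenue",
                           "revenue net of returns"))

print(history.as_of(date(2002, 5, 1)).rule)
```

Because earlier versions are retained, a global figure loaded in 2002 can still be explained by the rule that was in force when it was loaded, even after the rule has changed.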
