Sometime in the early dawn of the information processing age, where mists rose from the miasma of the swamps of early systems, an early information technology caveman looked up and saw that the carefully crafted applications of the day had formed a big mess. There were many difficulties and challenges associated with the early formative systems during the ice age of technology.
One of those difficulties with the early systems was that data was everywhere. The same data could be in fifteen different places, and no one ever knew which value was the right one. Trying to make a rational business decision based on fifteen different values for the same unit of data was a nightmare.
Faced with this and the many other challenges presented by the early systems, early information technology cavemen decided to do something about this state of dis-integrity of data.
The first reaction was to proclaim: Down with redundancy! Indeed, it appeared that redundancy of data was the culprit.
However, minimizing redundancy was not the answer. Redundancy is all around us in real life. Take time, for example. You see the time displayed on television, you hear it broadcast on the radio, it is displayed on bell towers, on wristwatches, stopwatches and so forth. Time is massively redundant across the world.
Stock market quotes provide another example of redundancy. Stock market quotes are in the paper, on ticker tape, on the Internet, in the Wall Street Journal, on TV and in many other places. In fact, when you stop and think about it, redundancy of data is practically everywhere in real life.
Thus it is with data in the corporation. The same data is found in master files, screens, statistical analyses, databases, on the PC, on the mainframe, in the DBMS, in enterprise resource planning (ERP) systems and elsewhere. Eliminating data redundancy is simply not realistic or productive.
Therefore, rather than try to depose redundancy, man was faced with the question of how to live with redundancy, given that it is a fact of life. Early information technology cavemen decided that what was needed was a system of record. The appealing notion of the system of record comes to us from the banking and financial environment.
How is it that a large bank can cope with numerous accounts, customers and balances of different kinds and face no problem keeping track of how much money is in an account? The answer is that the bank has a rigidly defined system of record for its accounts and its balances. In the bank, the system of record is the one and only place where a customer's account balance is kept. The account balance may be used elsewhere for reporting and other purposes; however, the account balance is only updated and reconciled in one place. When a customer comes into the bank and has a problem with his or her account balance, the bank knows where to go for account balance information. There is no confusion on the bank's part as to where the integrity of the data lies.
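The bank's arrangement can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `AccountLedger` and `ReportingCopy`, the account identifier and the amounts are all hypothetical and do not describe any real banking system. The point it shows is that redundant copies may exist, but updates happen in exactly one place.

```python
class AccountLedger:
    """Hypothetical system of record: the only place a balance is updated."""

    def __init__(self):
        self._balances = {}

    def post(self, account_id, amount):
        # Every update and reconciliation happens here and nowhere else.
        self._balances[account_id] = self._balances.get(account_id, 0) + amount
        return self._balances[account_id]

    def balance(self, account_id):
        return self._balances.get(account_id, 0)


class ReportingCopy:
    """Hypothetical redundant copy: refreshed from the ledger, never updated directly."""

    def __init__(self, ledger):
        self._ledger = ledger
        self._snapshot = {}

    def refresh(self, account_id):
        # The copy only ever takes its value from the system of record.
        self._snapshot[account_id] = self._ledger.balance(account_id)

    def balance(self, account_id):
        return self._snapshot.get(account_id)


ledger = AccountLedger()
report = ReportingCopy(ledger)

ledger.post("ACC-1", 100)   # deposit recorded in the system of record
report.refresh("ACC-1")     # the report takes a redundant copy
ledger.post("ACC-1", -30)   # a later withdrawal; the report is now stale
```

When the report and the ledger disagree, as they do after the withdrawal, there is no confusion about which one to believe: the ledger is the system of record, and the report is refreshed from it.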
Now we know how early information technology cavemen solved the problem of dis-integrity of information. Or do we?
Fast-forward to the new millennium. Instead of a bunch of applications, the world has large and complex architectures - the corporate information factory and the government information factory. What happens to the system of record when a large architecture such as the corporate information factory is considered?
In the corporate information factory or government information factory, there are many different kinds of architectural entities - data warehouses, data marts, operational systems and the like. Depending on the organization and its maturity, there are many different forms of data.
Consider how data is dragged across this landscape. Data enters the architecture at the point of operational systems. Data is then moved to a data warehouse. From the data warehouse, the data finds its way into DSS applications, data marts, exploration and data mining warehouses, ODSs, near-line storage and elsewhere. What happened to the system of record? Has the architecture of the corporate information factory or government information factory usurped the concept of the system of record? Has the proliferation of data across the corporate information factory or government information factory displaced the system of record? Do we now have data dis-integrity?
The answer is: not at all. Indeed, one of the great selling points of the corporate information factory or government information factory is that there is data integrity. However, the corporate information factory or government information factory requires that the concept of the system of record be stretched.
What happens in the corporate information factory or government information factory is that the system of record moves as data passes through its life cycle. When data first enters the corporate information factory or government information factory environment, it usually enters in the form of operational data captured and managed in an operational or transaction-based system. At the moment of entry and as long as the data is used operationally, the operational system holding the data becomes the system of record.
However, data does not reside in operational systems forever. At some point in the life of the data, the data ages and needs to be placed in the data warehouse. The data is no longer needed for operational processing and finds its way into the cavern of the data warehouse where historical data is kept. The data has passed from active usage to passive usage, and the data warehouse now becomes the system of record.
Time passes and the data in the data warehouse ages. At some point in time, data needs to be moved from the data warehouse. At this point, the probability of access of the data has diminished greatly. The data still has potential use, but the placement of the data in the data warehouse is questionable. The organization does not want to throw the data away; yet, at the same time, the organization does not want to keep the data in actively used storage supporting the data warehouse.
This is the point where data is placed in archival storage. As data is placed in archival storage, the system of record shifts again. The archival environment becomes the system of record.
Thus, the system of record can be in one of three places in the corporate information factory or government information factory - the operational environment, the data warehouse or the archival environment.
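The life cycle described above can be summarized as a small decision rule. The sketch below is an assumption-laden illustration, not a prescribed implementation: the `Stage` enum, the function name and the two boolean inputs are hypothetical stand-ins for whatever criteria an organization actually uses to decide when data has aged.

```python
from enum import Enum


class Stage(Enum):
    OPERATIONAL = "operational environment"
    WAREHOUSE = "data warehouse"
    ARCHIVE = "archival environment"


def system_of_record(in_operational_use: bool, likely_to_be_accessed: bool) -> Stage:
    """Return where the system of record sits for a unit of granular data.

    Both flags are hypothetical simplifications of the aging criteria.
    """
    if in_operational_use:
        # Data still used by transaction processing stays operational.
        return Stage.OPERATIONAL
    if likely_to_be_accessed:
        # Historical data with a reasonable probability of access.
        return Stage.WAREHOUSE
    # Data kept for potential use but rarely, if ever, accessed.
    return Stage.ARCHIVE
```

For example, `system_of_record(False, True)` returns `Stage.WAREHOUSE`, while `system_of_record(False, False)` returns `Stage.ARCHIVE`. At any moment the rule yields exactly one location, which is what keeps integrity intact even though copies of the data exist elsewhere.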
It is noteworthy that the system of record applies only to granular data. Derived, summarized or aggregated data has its system of record in the granular data used in the calculation or derivation of the data. The system of record for summarized or derived data changes as often as the wind blows. The real issue is not where the system of record for summarized data or aggregated data resides, but where the system of record for the granular, detailed data resides.
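The relationship between summarized data and its granular source can be illustrated with a short sketch. The sales records, field names and helper functions here are invented for the example; the only point is that a summary is always recomputable from, and traceable to, the granular rows.

```python
# Hypothetical granular records held in the system of record.
granular_sales = [
    {"sale_id": 1, "region": "east", "amount": 120},
    {"sale_id": 2, "region": "west", "amount": 80},
    {"sale_id": 3, "region": "east", "amount": 50},
]


def total_by_region(rows):
    """Derive a summary; the result is disposable and can be rebuilt at any time."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0) + row["amount"]
    return totals


def trace_to_granular(rows, region):
    """Answer 'where did this total come from?' by pointing at granular rows."""
    return [row["sale_id"] for row in rows if row["region"] == region]


summary = total_by_region(granular_sales)
east_sources = trace_to_granular(granular_sales, "east")
```

If a summarized figure is ever questioned, the dispute is settled against the granular rows, not against another copy of the summary.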