The world of technology is in its infancy. Compared to other more mature professions, we are still in diapers. Roads and walls in Rome that are used today were built by engineers 2,000 years ago. The hieroglyphics on the tombs in Egypt contain proclamations from accountants saying how much grain was owed the Pharaoh. Excavations of Clovis man from caves in Chile show that mankind practiced medicine in at least a crude form as many as 16,000 years ago. A historical comparison of the IT profession to the engineering, accounting and medical professions shows that the IT profession is barely out of the womb. The origins of the IT profession go back to 1950 or so. Other professions go back to the dawn of civilization.

However, the IT profession has come a long way fast. As evidence of how far we have come, look at the IT architecture, or at least architecture as perceived by the IT professional. Figure 1 briefly outlines the advances in architecture that the IT profession has made.

Figure 1: IT Profession Advances in Architecture

Prior to 1983, there were applications. Accounts payable, accounts receivable, online and batch were all types of applications that dominated the landscape. However, around 1983, someone decided that there was a need for information, not data. There arose a need to look across the corporation, not just at a tiny application area. In addition, it was noticed that there was no historical data to speak of; applications jettisoned historical data as quickly as they could in order to preserve performance.

Thus in 1983 the early form of the data warehouse was born - atomic data. The need for granular, integrated, historical data opened the doors to styles of processing never before possible. With the data warehouse, business intelligence became a possibility. Without the data warehouse, business intelligence was just a theory. However, it was quickly discovered that creating the data warehouse from legacy systems in their moribund state required more than programmers. The legacy environment was so frozen - so tightly bound - that a way to automatically access and integrate data was needed. Extract, transform and load (ETL) appeared in 1990. With ETL, data could be accessed and integrated from the legacy application environment.
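The extract, transform and load cycle described above can be sketched in a few lines of Python; the legacy export format, field names and date conventions here are hypothetical, invented purely to illustrate the three steps.

```python
import csv
import io

# Hypothetical legacy extract: a CSV export from an accounts application,
# with amounts as strings and dates in YYYYMMDD form.
LEGACY_EXPORT = """cust_id,amount,date
001,1500.00,19891231
002, 250.50,19900115
"""

def extract(raw):
    """Pull raw rows out of the legacy export."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows):
    """Integrate: normalize types and date formats for the warehouse."""
    out = []
    for r in rows:
        out.append({
            "cust_id": r["cust_id"].strip(),
            "amount": float(r["amount"]),
            "date": f'{r["date"][:4]}-{r["date"][4:6]}-{r["date"][6:]}',
        })
    return out

def load(rows, warehouse):
    """Append granular, historical rows to the warehouse table."""
    warehouse.extend(rows)

warehouse = []
load(transform(extract(LEGACY_EXPORT)), warehouse)
print(warehouse[0])  # {'cust_id': '001', 'amount': 1500.0, 'date': '1989-12-31'}
```

The point of the sketch is the separation of concerns: extraction knows the legacy format, transformation knows the warehouse's integrated format, and loading knows neither.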

ETL opened the floodgates for business intelligence. Beginning in approximately 1994, there were all sorts of extensions to the data warehouse: multidimensional OLAP (online analytical processing) data marts, the exploration warehouse and the operational data store (ODS). Soon, people were doing all sorts of business intelligence. With the ODS, it was possible to do real-time processing where updates and transactions could be run against integrated data. With data marts, star schemas and fact tables found their home. With the exploration warehouse, statisticians had a foundation of data that freed the data miner from data administration chores to work as a statistical analyst.
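The star schema that found its home in the data mart - a central fact table keyed to surrounding dimension tables - can be sketched with plain in-memory tables; the products, dates and measures below are invented for illustration.

```python
# Hypothetical star schema: one fact table keyed to two dimension tables.
dim_product = {
    "P1": {"name": "widget", "category": "hardware"},
    "P2": {"name": "manual", "category": "docs"},
}
dim_date = {
    "D1": {"year": 1994, "quarter": "Q1"},
}
fact_sales = [
    {"product_key": "P1", "date_key": "D1", "units": 10, "revenue": 500.0},
    {"product_key": "P2", "date_key": "D1", "units": 3, "revenue": 45.0},
]

def revenue_by_category(facts):
    """A typical star-schema query: join facts to a dimension, then aggregate."""
    totals = {}
    for f in facts:
        category = dim_product[f["product_key"]]["category"]
        totals[category] = totals.get(category, 0.0) + f["revenue"]
    return totals

print(revenue_by_category(fact_sales))  # {'hardware': 500.0, 'docs': 45.0}
```

The design choice the star schema embodies is visible even here: measures live only in the fact table, descriptive attributes live only in the dimensions, and every query is a join from the center outward.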

It is at this time that the data warehouse evolved into the information factory.

In 2000 came the Web explosion. Organizations started to use the Web environment as a separate arm for marketing and sales. At first, the Web professionals wanted to remain separate from corporate systems. However, it was quickly discovered that success in the Web environment demanded integration with corporate systems. The connection with the corporate environment was made by straining data through a granularity manager, which then placed the data into the data warehouse. Data going the other way - from the corporate environment to the Web environment - flowed through the ODS. At this time, decision support system (DSS) applications were also being built, and corporate performance management was becoming a reality. In addition, changed data capture began to emerge, and the adaptive data mart was added to the world of business intelligence. The adaptive data mart is a temporary structure that has some of the characteristics of both a data mart and an exploration warehouse.
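The "straining" role of the granularity manager can be sketched as a simple condensing step; the clickstream event fields and the page-level summarization below are assumptions for illustration, since a real granularity manager aggregates by whatever dimensions the warehouse design calls for.

```python
from collections import Counter

# Hypothetical raw clickstream events from the Web environment - far too
# fine-grained to flow into the data warehouse as-is.
events = [
    {"session": "s1", "page": "/home"},
    {"session": "s1", "page": "/cart"},
    {"session": "s2", "page": "/home"},
]

def granularity_manager(raw_events):
    """Condense raw clicks into page-level counts the warehouse can absorb."""
    return Counter(e["page"] for e in raw_events)

print(granularity_manager(events))  # Counter({'/home': 2, '/cart': 1})
```

The essential idea is that detail is discarded deliberately at the boundary: the warehouse receives summarized, integration-ready rows rather than every raw click.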

At approximately the same time, the volumes of data found in the data warehouse were growing explosively. Placing all of the data found in a data warehouse on disk storage became very unappealing because massive amounts of that data were being used very infrequently. At this point, placing data on multiple physical storage media became attractive.

Enterprise application integration (EAI) made its appearance as a back-end mechanism for transporting application data from one application to another. EAI focused on the speed of transmission and the volumes of data being transmitted. Little or no integration of data was accomplished by EAI.

In 2004, more refinements to the information factory were added. The two most prominent were the virtual operational data store (VODS) and the addition of unstructured data. The VODS allowed organizations to access data on the fly, without building an infrastructure; it was very flexible and very fast to construct. However, the VODS offered answers to queries that were valid only as of the moment the query was made.

Unstructured data began to be combined with structured data and an entirely new world of applications was possible. For the first time, corporate communications could be combined with corporate transactions. The picture that was painted was much more complete than anything previously created.

Other additions to the information factory included archival data. Archival data complemented near-line storage and allowed organizations to manage far more data than ever before. Managing the traffic between the data warehouse and the near-line/archival environment was the cross media storage manager (CMSM). As the probability that data would be accessed fluctuated, the CMSM moved rows of data between the storage tiers.
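A minimal sketch of the traffic-management role the CMSM plays might look like the following; the tier names, age thresholds and row format are assumptions for illustration, not a description of any actual product.

```python
import time

# Hypothetical tiers and thresholds; a real CMSM would act on access
# statistics gathered by the warehouse monitor, not fixed cutoffs.
HOT_THRESHOLD_DAYS = 30      # recently touched rows stay on disk
WARM_THRESHOLD_DAYS = 365    # older rows move to near-line storage

def assign_tier(last_access_ts, now=None):
    """Pick a storage tier from the row's last access time."""
    now = now if now is not None else time.time()
    age_days = (now - last_access_ts) / 86400
    if age_days <= HOT_THRESHOLD_DAYS:
        return "disk"
    if age_days <= WARM_THRESHOLD_DAYS:
        return "near-line"
    return "archive"

def migrate(rows, now=None):
    """Move each row to the tier its probability of access suggests."""
    tiers = {"disk": [], "near-line": [], "archive": []}
    for row in rows:
        tiers[assign_tier(row["last_access"], now)].append(row)
    return tiers

now = time.time()
rows = [
    {"id": 1, "last_access": now - 5 * 86400},    # recent   -> disk
    {"id": 2, "last_access": now - 200 * 86400},  # stale    -> near-line
    {"id": 3, "last_access": now - 900 * 86400},  # ancient  -> archive
]
placed = migrate(rows, now)
print([len(placed[t]) for t in ("disk", "near-line", "archive")])  # [1, 1, 1]
```

As access patterns shift, re-running the migration moves rows between tiers, which is the fluctuation the article describes.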

Supplementing the unstructured environment was unstructured visualization technology. Unstructured visualization was the equivalent of business intelligence, except that it addressed textual data while business intelligence addressed numeric data.

In addition, it became apparent that a monitor for the data warehouse environment was needed. The monitor for data warehousing was fundamentally different from the monitor for transaction processing.

The world of data exploration began to mature into the separate practices of data mining and data exploration - disciplines with subtle yet real differences between them.

All of this architecture came under the name of the information factory or the corporate information factory (CIF). The CIF is a living organism that is constantly growing and changing. Each new advance in technology causes some change to the CIF.

Looking at the growth of the CIF from 1980 to 2005 is like watching man evolve from creatures swinging in the trees to today's modern family - with cars, TVs, homes, hot and cold running water, and fresh food readily available at the marketplace. The only thing that is certain is that the evolution of the corporate information factory will not stop where it is today.
