We are pleased to welcome Michael Haisten, Vice President of Information Architecture Services at Daman Consulting, as one of dmreview.com’s on-line columnists. He will be providing expert advice on data warehousing and enterprise application integration solutions. Haisten has been a leader in the information management and architecture disciplines for 22 years. He has served as chief information architect at Apple Computer Corporation, and he has led the design and development of more than 65 data warehouse and decision support projects. In addition, Haisten has published extensively on data warehouse planning, data access facilitation and other key aspects of data warehousing.
If you have any feedback, questions or suggestions about Haisten’s column, please contact Rachel Rasmussen, Web Editor, at email@example.com.
Editor’s Note: This article is the first in a three-part series discussing data warehouse evolution.
The concept of data warehousing is older than most people realize. The road from there to here has been rough, to say the least. The initial expectations turned out to be unattainable (enterprise integration). Our fallback position was too tactical (data marts). The options grew to be too diverse, too ambiguous and downright confusing.
Happily, we have entered a period of synthesis in which a new consensus on design is emerging. This is a good thing, because we are about to embark on possibly the most radical path yet: real-time data warehousing is within our grasp.
The bulk of this three-part series is a review of data warehouse history. The intent is to set the stage for a series of discussions on the future of data warehousing and enterprise architecture.
Architectural Beginnings (1978-1988)
The origin of the data warehouse can be traced to studies at MIT in the 1970s aimed at developing an optimal technical architecture. At the time, the craft of data processing was evolving into the profession of information management. For the first time, the MIT researchers differentiated between operational systems and analytic applications. Their intent was to develop architectural guidelines for developing new solutions from the ground up. A core principle that emerged was to segregate operational and analytic processing into layers with independent data stores and radically different design principles.
One contributing factor was the extremely limited processing and storage capacity available at the time. This helped justify the Information Center1 phenomenon of the 1980s, which was motivated strongly by the desire to off-load the new and wildly unpredictable analytic load from the transactional systems platform. Fortunately, though, the heart of the argument was that these two forms of information processing are, and by their very nature always would be, very different from one another; so different, in fact, that they require unique architectural designs.
In the mid- to late-1980s, Digital Equipment Corporation (DEC) was one of the most technically advanced companies on the planet and was the first to build a distributed network architecture to host their business applications. DEC was also the first to migrate to a relational database environment using their own DBMS product, Rdb.
DEC planned to be the showcase for a new approach to application architecture. They pulled together a team from many different disciplines including engineering, marketing and finance as well as information technology groups. The mandate was not only to invent this new architecture but also to apply it to Digital’s own global financial systems. The architecture team proceeded to combine the MIT design principles with their newfound expertise in networks and relational databases to create the Technical Architecture 2 (TA-2) specification. TA-2 defined four service categories: data capture, data access, directory and user services.
Data capture services are the operational component of the MIT model, and data access services are the analytic component. Each of these services would be implemented as one or more distributed applications interacting with segregated data stores on a worldwide network. The TA-2 team added directory services to help people find what they needed on the network and user services to support direct interaction with the data.
User services contained the human-machine interfaces for all the other services. This was the most novel twist in the architecture. For the first time, the interface was defined as a standalone program interacting over a network with multiple independent data service providers. It is ironic that Digital Equipment Corporation, the company that said personal computers would never find a home in the enterprise environment, was the first to implement client/server computing on a large scale.
Enterprise Integration (1988)
Meanwhile, IBM was tackling a different aspect of the information management problem. One of the most vexing problems reported by their customers was the growth of information silos. Whether due to decentralized operations or mergers or changes in business model or whatever, companies faced the daunting task of integrating data from many separate systems with dissimilar coding schemes.
In 1988, Barry Devlin and Paul Murphy of IBM Ireland tackled the problem of enterprise integration head-on. They introduced the term "information warehouse" for the first time and defined it as: "A structured environment supporting end users in managing the complete business and supporting the (information technology department) in managing data quality."2
IBM knew a good thing when they saw it. Information Centers sold a lot of boxes in the 1980s. Maybe this information warehouse could be their ticket to renewed growth in the 1990s. But IBM failed to follow up their intellectual lead with anything but marketing rhetoric. Where was the architecture? How do you go about building one of these things?
Between 1988 and 1991, I was a member of a cross-functional team that set out to fill this gap. Our ambition was nothing less than to build a complete design guide for application development that married the new client/server capabilities to the traditional, generally mainframe, computing environment. Rather than start from scratch, our intent was to update and expand the DEC TA-2 architecture while adding the enterprise integration goals of IBM’s information warehouse concept. We even toyed with the idea of calling our architecture TA-3 before we settled on VITAL (virtually integrated technical architecture life cycle).3
On the technology side, we had to incorporate personal computers, graphical interfaces, object-oriented component environments and local area networks into the mix. Our greatest contribution, however, was to flesh out the details of hundreds of architectural components for application development in the client/server era. We specified more than 85 components in our data warehouse section alone, including generic extraction, transformation, validation, loading, cube development and graphical query tools. Since you all know the sorry history of Apple Computer in enterprise computing, I am sure you can guess what happened to these leading-edge designs.
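The generic extraction, transformation, validation and loading components mentioned above follow a pattern that is still the backbone of warehouse population today. As a minimal sketch only (the function names, coding-scheme rules and sample data here are illustrative inventions, not the actual VITAL component specifications):

```python
# Illustrative extract/transform/validate/load pipeline.
# All names and rules below are hypothetical examples of the pattern,
# not the VITAL or TA-2 component designs.

def extract(source):
    """Pull raw records from an operational source (here, a list)."""
    return list(source)

def transform(records):
    """Normalize dissimilar coding schemes, e.g. unify region codes."""
    region_map = {"W": "WEST", "WEST": "WEST", "E": "EAST", "EAST": "EAST"}
    return [
        {**r, "region": region_map.get(r["region"], r["region"])}
        for r in records
    ]

def validate(records):
    """Reject records that fail basic data quality rules."""
    return [r for r in records if r["amount"] >= 0]

def load(records, warehouse):
    """Append cleansed, integrated records to the analytic store."""
    warehouse.extend(records)
    return warehouse

warehouse = []
raw = [
    {"region": "W", "amount": 100},
    {"region": "EAST", "amount": -5},   # fails the quality rule
    {"region": "E", "amount": 40},
]
load(validate(transform(extract(raw))), warehouse)
print(warehouse)  # two rows survive, with region codes unified
```

The point of segregating these stages, then as now, is that each can be specified, tested and replaced independently of the operational systems feeding it.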
By the early 1990s, an initial rationale existed for data warehousing supported by well-developed architectural frameworks. The principle of analytic segregation was well established. The need for a clean, integrated, cross-functional view of the business was gaining ground. The base technologies, including common relational access, ubiquitous networking, client/server computing and graphical interfaces with component (object-oriented) software, were on the way.
Bleeding-edge companies built data warehouses between 1988 and 1991. The pieces were in place for wider adoption. All that was needed was the catalyst. In the second part of this article series, I will discuss the enterprise data warehouse in the early 1990s.
1. The term Information Center can be traced to a design monograph developed at IBM Canada in 1979.
2. Devlin, B.A. and Murphy, P.T. "An architecture for a business and information system." IBM Systems Journal, Vol. 27, No. 1, 1988.
3. The original meaning of VITAL was VAX IBM Tandem Application Life cycle, in deference to the dominant platforms in use at Apple at the time, notwithstanding the Macintosh of course. This gives you some hint of the scope of our integration challenge.