This month's column was contributed by JT Taylor, product manager for Software AG's Enterprise Information Integrator.

Recently a new type of integration has emerged: enterprise information integration, or EII. Information integration is a separate and distinct type of integration when compared to application integration, but why? What about data integration? How is that different? Or is it? These are the questions I hope to answer here.

First, we should define EII. EII is the integration of data from multiple systems into a unified, consistent and accurate representation geared toward the viewing and manipulation of the data. Data is aggregated, restructured and relabeled (if necessary) and presented to the user. We'll come back to this definition after we take a look at other types of integration and how they compare with our definition.

Data integration has been around the longest. Data integration is the extraction, transformation and loading (ETL) of data from disparate systems into a single data store for the purposes of manipulation and evaluation (reporting). Data warehouses and data marts are the data stores, and ETL tools are the "data integration" components.

Supporting this type of integration requires a thorough analysis of the participating systems and data to determine the relevant data to be extracted, the transformation steps necessary to "cleanse" the data and the destination data structures in which to load the data. Reporting is conducted via analytical reporting tools that can find new ways of viewing the gathered data in order to create useful, decision-supporting information.
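The three ETL steps can be sketched in a few lines of Python. This is only an illustration, not how any particular ETL tool works: the two source systems, their field names (`cust_id`, `amount_cents` and so on) and the `sales` table are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical raw rows as they might arrive from two source systems.
CRM_ROWS = [{"cust_id": "17", "name": " Ada Lovelace ", "total": "120.50"}]
BILLING_ROWS = [{"customer": 17, "amount_cents": 9950}]

def extract():
    """Extract: pull raw records from each participating system."""
    yield from (("crm", row) for row in CRM_ROWS)
    yield from (("billing", row) for row in BILLING_ROWS)

def transform(source, row):
    """Transform: cleanse a record into the (customer_id, name, amount) shape."""
    if source == "crm":
        return (int(row["cust_id"]), row["name"].strip(), float(row["total"]))
    return (int(row["customer"]), None, row["amount_cents"] / 100)

def load(conn, records):
    """Load: write the cleansed records into the destination structure."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer_id INTEGER, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()

def run_etl(conn):
    load(conn, [transform(source, row) for source, row in extract()])

# The "warehouse" is an in-memory database for the sake of the sketch.
conn = sqlite3.connect(":memory:")
run_etl(conn)

# Reporting then runs against the loaded snapshot, not the source systems.
total = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE customer_id = 17"
).fetchone()[0]
```

Note that the report reads from the copy in the warehouse; the source systems are touched only when the ETL job runs.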

There is a necessity for this type of integration - but is it information integration? I don't think so. Data integration is primarily involved with manipulation and evaluation of historical data to detect trends that are otherwise not apparent or to support "what-if" inquiries by adjusting some of the values in order to project or forecast as yet unforeseen possibilities. This is a very important type of integration focused mainly on supporting decision-makers.

Application integration, on the other hand, is focused on the integration of data among a collection of applications or systems. As data is changed in one system, the change is propagated to other systems of interest, usually via asynchronous messaging. A few years ago the acronym "EAI" appeared to describe an integration toolset that consisted of a messaging system, a broker for routing and transformation, and a collection of adapters that eased the interfacing with applications and data from various systems.
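The broker-and-adapter arrangement can be sketched as follows. This is a simplified, single-process illustration: the `Broker` class, the adapter functions and the message fields are hypothetical names, and a plain in-memory queue stands in for a real messaging system.

```python
from collections import deque

class Broker:
    """Queues change messages and routes them to every subscribed adapter."""

    def __init__(self):
        self.queue = deque()
        self.adapters = []

    def subscribe(self, adapter):
        self.adapters.append(adapter)

    def publish(self, message):
        # Asynchronous in spirit: the sender does not wait for the receivers.
        self.queue.append(message)

    def deliver_pending(self):
        while self.queue:
            message = self.queue.popleft()
            for adapter in self.adapters:
                adapter(message)

# Two target systems, each with its own representation of an address.
billing = {}
shipping = {}

def billing_adapter(message):
    """Adapter: translate the message into the billing system's format."""
    billing[message["customer_id"]] = {"addr": message["address"]}

def shipping_adapter(message):
    """Adapter: translate the message into the shipping system's format."""
    shipping[message["customer_id"]] = {"ship_to": message["address"].upper()}

broker = Broker()
broker.subscribe(billing_adapter)
broker.subscribe(shipping_adapter)

# A change made in one system is published once...
broker.publish({"customer_id": 17, "address": "12 Analytical Way"})
pending_before_delivery = len(broker.queue)

# ...and propagated to the other systems when the broker delivers it.
broker.deliver_pending()
```

The point of the sketch is the direction of flow: each system keeps its own copy of the data, and the integration layer exists to keep those copies in sync.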

Enterprise application integration (EAI) still exists today and is still very relevant to organizations. Keeping data synchronized across a wide collection of heterogeneous systems is certainly an ongoing challenge to most organizations. So, this too remains an important type of integration. But is it information integration? Again, I don't think so. Application integration, though required by business functions, is primarily the domain of an IT organization, whose responsibility is to keep an organization's various systems in sync with each other.

As stated earlier, information integration is focused on the integration of data from multiple systems into a unified, consistent and accurate representation geared toward the viewing and manipulation of that data. Information integration is targeted squarely at end users who are required to deal with multiple systems in order to perform their given tasks.

Providing a unified view of data from disparate systems comes with a unique set of requirements and constraints. First, the data should be accessible in a "real-time" fashion - meaning that we should be accessing the systems directly as opposed to accessing stale data from a previously captured snapshot. Second, the semantics, or meaning, of the data need to be resolved across systems - this is the consistency I referred to earlier. Different systems may represent the data with different labels and formats that are relevant to their respective uses, but that require some sort of correlation by the end user in order to be useful. Duplications are removed, validity is checked, labels are matched and values are reformatted, etc. - all usually performed manually (or mentally) by the end user on demand.
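The correlation work described above can be sketched in Python. This is a minimal illustration under stated assumptions: the two source systems, their labels (`CustName`, `tel` and so on) and the ten-digit phone rule are hypothetical, and the `fetch_*` functions stand in for direct, on-demand calls to the live systems.

```python
def fetch_crm():
    """Stand-in for a live call to a hypothetical CRM system."""
    return [{"CustName": "Ada Lovelace", "Phone": "(555) 010-0199"}]

def fetch_orders():
    """Stand-in for a live call to a hypothetical order system."""
    return [
        {"customer_name": "ADA LOVELACE", "tel": "555-010-0199"},
        {"customer_name": "Alan Turing", "tel": "555 010 0142"},
        {"customer_name": "Bad Record", "tel": "n/a"},
    ]

# Label matching: each system's field names mapped to shared labels.
LABELS = {
    "crm": {"CustName": "name", "Phone": "phone"},
    "orders": {"customer_name": "name", "tel": "phone"},
}

def normalize(source, record):
    """Relabel a record and reformat its values into one representation."""
    clean = {LABELS[source][key]: value for key, value in record.items()}
    clean["name"] = clean["name"].title()
    clean["phone"] = "".join(ch for ch in clean["phone"] if ch.isdigit())
    return clean

def is_valid(record):
    """Validity check (assumed rule): a phone number has exactly ten digits."""
    return len(record["phone"]) == 10

def unified_view():
    """Query the systems on demand and return one consistent record set."""
    seen = {}
    for source, fetch in (("crm", fetch_crm), ("orders", fetch_orders)):
        for record in fetch():
            clean = normalize(source, record)
            if is_valid(clean):
                # Duplicate removal: the first record for a given key wins.
                seen.setdefault((clean["name"], clean["phone"]), clean)
    return list(seen.values())

view = unified_view()
```

Nothing is copied into a separate store here; the view is assembled at the moment it is requested, which is the "real-time" requirement in practice.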

Figure 1: Definitions

How is real information integration achieved? We believe that it starts with a service-oriented architecture (SOA). This provides a universal access mechanism to all systems via Web services and a universal data representation via XML. Also, this allows access to data not conveniently located in a database - commercial packages, custom applications, Web content, documents, images, feeds, etc. Having an SOA as a foundation supports the integration and development of information from structured, transactional systems as well as unstructured, content-based systems.
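The idea of one access mechanism and one data representation can be sketched as follows. This is an illustration only: the service names and XML element names are hypothetical, and plain Python functions stand in for Web services that would normally be called over a network.

```python
import xml.etree.ElementTree as ET

def database_service(customer_id):
    """Stand-in for a service that fronts a structured, transactional system."""
    row = {"id": customer_id, "name": "Ada Lovelace"}
    root = ET.Element("customer", id=str(row["id"]))
    ET.SubElement(root, "name").text = row["name"]
    return ET.tostring(root, encoding="unicode")

def document_service(customer_id):
    """Stand-in for a service that fronts an unstructured content system."""
    root = ET.Element("documents", customer=str(customer_id))
    ET.SubElement(root, "document", type="contract").text = "contract-2004.pdf"
    return ET.tostring(root, encoding="unicode")

# One registry, one calling convention, regardless of what sits behind it.
SERVICES = {"customer": database_service, "documents": document_service}

def call(service_name, customer_id):
    """Universal access: every system answers in XML, parsed the same way."""
    return ET.fromstring(SERVICES[service_name](customer_id))

customer = call("customer", 17)
documents = call("documents", 17)
```

The caller never needs to know whether the answer came from a database row or a document repository; both arrive as XML through the same call.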

Beginning with the metadata describing access and representation, we can build an information model of an organization's information set containing relationships and rules that represent the semantics of the data and its interaction with other data and processes. This model is best represented with the Web Ontology Language (OWL) from the Semantic Web group of the W3C. While the Semantic Web may be years away from attainment (if ever), a model-driven enterprise is achievable today. By creating an active model of data entities and mapping those entities to their respective sources exposed as Web services, true enterprise information integration is finally realized. And the catalyst comes with the word "semantics." Semantics gives meaning to something. Data with meaning is information. Data without meaning is nothing but a collection of bytes or characters.

Figure 2: Keys to Information Integration

Meaningful information is achieved by describing relationships between data and the rules that govern its use. This has always been described as "business logic" and has previously been captured solely in the programming code of a system - which inherently means that it is non-existent between systems. These relationships and rules, or business logic, can now be captured directly in an information model using OWL. An OWL inference engine is then capable of reading a model described in OWL, reading in data instances that populate the model, and then acting as an intelligent integration engine that can provide a unified view of information by performing all of the "cleansing" and "correlation" previously done by hand (or not at all). Duplicates do not have to be removed, values do not have to be adjusted, and everything is left in its originating system.
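To make the idea concrete, here is a toy sketch in Python. It is not an OWL inference engine and does not reflect how any product works: it applies simplified versions of just two OWL constructs, `owl:sameAs` and `owl:equivalentProperty` (the latter in one direction only), to plain tuples. The `crm:`, `erp:` and `ex:` names are hypothetical.

```python
SAME_AS = "owl:sameAs"
EQUIV_PROP = "owl:equivalentProperty"

# The model: rules relating each system's labels to shared ones.
model = {
    ("crm:custName", EQUIV_PROP, "ex:name"),
    ("erp:telefon", EQUIV_PROP, "ex:phone"),
}

# Data instances, stated in each originating system's own vocabulary.
data = {
    ("crm:cust17", "crm:custName", "Ada Lovelace"),
    ("erp:k-0017", "erp:telefon", "555-0199"),
    ("crm:cust17", SAME_AS, "erp:k-0017"),
}

def infer(triples):
    """Apply the two rules repeatedly until no new facts appear."""
    facts = set(triples)  # work on a copy; the inputs are left untouched
    while True:
        new = set()
        for s, p, o in facts:
            if p == SAME_AS:
                # sameAs is symmetric, and both names share their facts.
                new.add((o, SAME_AS, s))
                for s2, p2, o2 in facts:
                    if s2 == s and p2 != SAME_AS:
                        new.add((o, p2, o2))
            elif p == EQUIV_PROP:
                # Restate source-labelled facts under the shared label.
                for s2, p2, o2 in facts:
                    if p2 == s:
                        new.add((s2, o, o2))
        if new <= facts:
            return facts
        facts |= new

def view(facts, subject):
    """Unified view: only the shared (ex:) properties of one entity."""
    return {p: o for s, p, o in facts if s == subject and p.startswith("ex:")}

facts = infer(model | data)
unified = view(facts, "crm:cust17")
```

The source facts are never rewritten or merged. The two customer records remain separate in `data`, and the unified view is derived from the model's rules, which is the sense in which duplicates do not have to be removed.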

This powerful concept of semantically modeling information and mapping the model to the underlying infrastructure via a service-oriented architecture is one concept of enterprise information integration. This loosely coupled system, built by knowledge workers instead of programmers, is the future of information integration.

John (JT) Taylor is currently the product manager for Software AG's Enterprise Information Integrator. He has been involved with integration projects since the mid-1990s and information technologies in general for more than 20 years. Taylor began his career in commercial software in research and development, where he designed and developed five products. After switching to a marketing role, he has been involved in the product marketing and evangelizing of two integration product lines. Software AG is a founding member of the Integration Consortium (IC). The company also established and chairs the IC Enterprise Information Integration (EII) Committee.
