Enterprise information integration (EII) is the latest buzzword; database technology vendors largely associate it with federated databases. Not to be confused with enterprise application integration (EAI), which integrates applications, EII is slated to integrate information. Information is supposed to give meaning to a collection of data and can be used in a variety of ways. If "information" is not merely equivalent to "data," then EII is supposed to go beyond simple data integration. Consider simple data integration, which comes in two distinct types:
- Bulk data integration, which is associated with aggregation, extraction, cleansing and loading, is the backbone of both tactical (operational data store - ODS) and strategic (data warehouse) decision making. Traditionally, this is achieved with custom scripts (PL/SQL, stored procedures, shell scripts) and ETL products. This type of integration does not happen in real time.
- Non-bulk data exchange and synchronization in real time or near real time, for faster communication of data between systems. This helps to increase the speed of business, since data is synchronized as soon as it changes and a single version of the truth becomes possible. This is the arena of EAI, where the "adapters" sense changes in data and the "broker" propagates the data to the relevant systems. The transformation and cleansing are expected to be less data-intensive and can be applied by program logic either to pieces of data or to "canonicals."
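The adapter-and-broker arrangement in the second case can be sketched roughly as follows. This is a minimal illustration only, assuming an in-memory broker and change detection by polling; the names `Broker`, `Adapter`, `propagate` and `poll` are hypothetical and are not taken from any particular EAI product.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


class Broker:
    """Hypothetical in-memory broker: forwards each change to every subscriber."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[str, Record], None]] = []

    def subscribe(self, handler: Callable[[str, Record], None]) -> None:
        self._subscribers.append(handler)

    def propagate(self, key: str, record: Record) -> None:
        for handler in self._subscribers:
            handler(key, record)


class Adapter:
    """Hypothetical adapter: senses changes by comparing the source
    against the snapshot taken on the previous poll."""

    def __init__(self, source: Dict[str, Record], broker: Broker) -> None:
        self._source = source
        self._broker = broker
        self._snapshot: Dict[str, Record] = {}

    def poll(self) -> int:
        """Send new or changed records to the broker; return how many were sent."""
        sent = 0
        for key, record in self._source.items():
            if self._snapshot.get(key) != record:
                self._broker.propagate(key, dict(record))
                self._snapshot[key] = dict(record)  # copy, so later edits are detected
                sent += 1
        return sent
```

The sketch ignores deletions, ordering and delivery guarantees, all of which a real broker would have to address.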
In recent times, the feeling is that these two types of integration are not sufficient. There are special cases in which something beyond data integration is required - where two systems hold more or less the same information, but with some variance, and that information has to be transacted on in real time. This scenario requires systems that can identify the variations in real time, decide what data has to be transacted upon, and integrate this business process with the existing business processes with minimal disruption.
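As a rough illustration of what identifying the variations could mean at the level of a single record, the sketch below compares two records that describe the same entity in two systems. The function name `find_variations`, the field names and the idea of a simple field-to-field map are assumptions made for this example, not a description of any EII tool.

```python
from typing import Dict, Optional, Tuple

Record = Dict[str, str]


def find_variations(
    record_a: Record,
    record_b: Record,
    field_map: Dict[str, str],
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return the fields whose values differ between two systems.

    field_map maps a field name in system A to the corresponding
    field name in system B (a hypothetical, hand-built mapping).
    """
    variations = {}
    for field_a, field_b in field_map.items():
        value_a = record_a.get(field_a)
        value_b = record_b.get(field_b)
        if value_a != value_b:
            variations[field_a] = (value_a, value_b)
    return variations


# Illustrative data only: the same customer as seen by two systems.
contact_center = {"cust_name": "Ann Lee", "phone": "555-0100"}
web_portal = {"name": "Ann Lee", "tel": "555-0199"}
mapping = {"cust_name": "name", "phone": "tel"}

differences = find_variations(contact_center, web_portal, mapping)
```

Here `differences` holds only the `phone` entry, which is the piece of data a downstream business process would then have to decide how to transact on.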
Need for Information Integration
Real-time identification, aggregation and differentiation of similar data have gained tremendous importance in the recent past on three fronts:
- In customer relationship management (CRM), the number of customer touchpoints has increased. From Web to contact center to IVR to WAP to e-mail, customer data is scattered across many places. This data needs to be synchronized, organized and linked. Typically, the different systems that handle customer information are built independently and without a strong data strategy to connect them. Customer data is typically synchronized in non-real time for analytical purposes, but integrating and linking data for real-time use is still not a reality. Even in banking, which is a technology-savvy industry, a customer's accounts are often not all linked to provide real-time information; therefore, obtaining a 360-degree view of the customer becomes very difficult.
- In the manufacturing or retail industry, the product or item data, which represents the key business entity, is processed differently by different departments. In a retail scenario, a buying department needs the item information linked with vendor and store information; a warehouse department needs the same information linked with shipping and consignee information. It is not uncommon in the manufacturing industry to find products categorized differently even if they contain similar part numbers. Thus, different stakeholders must process multiple views of the same information.
- In business, the adoption of industry standards is gaining importance. In retail, GTIN and UCCNet demand a cleaner representation of enterprise data. Existing data that is used by a multitude of systems cannot be changed. The only alternative is to create a unified view of the already available data, which can then be used to meet the new demands of standards compliance. Similar demands are created by the introduction of new technologies such as RFID.
The main roadblock to implementing such solutions is that the amount of disruption might be very high. The situation is changing with the advent of EII tools and methodologies.

Figure 1: EII Helps Enterprises Understand the Data
Figure 1 illustrates that enterprises need to start understanding the data - or rather the meaning of the data. True EII should help to achieve this.
Elements of Information Integration
As discussed earlier, EII goes beyond building an operational data store, a data warehouse or enterprise application integration. Figure 2 shows the different elements in action.

Figure 2
The first logical component required for EII is a data hub, which performs a different function from that of a staging area. This hub hosts the meta data about all the participating systems. Building this meta data is a different question altogether - ideally, the EII toolset should provide features for extracting meta information from existing systems. In a simple case, this meta data will identify the various participating systems, their locations and connection information and, most importantly, the schema of the relevant data and the relationships and variations between the data elements of the participating systems. A maintenance environment for this is also needed. All these things constitute the data hub.
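To make the contents of such a hub more concrete, here is a minimal sketch of the meta data described above, held in memory. Every name in it (`SystemEntry`, `DataHub`, `relate`, `variations`, the connection strings and the field names) is a hypothetical choice for illustration; a real EII toolset would define its own model and would normally extract this information from the participating systems rather than have it typed in.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SystemEntry:
    """Meta data for one participating system (hypothetical fields)."""
    system_id: str
    location: str
    connection: str          # placeholder connection string, not a real DSN
    schema: Dict[str, str]   # local field name -> declared type


@dataclass
class DataHub:
    """Holds system meta data and the relationships between data elements."""
    systems: Dict[str, SystemEntry] = field(default_factory=dict)
    # shared element name -> {system_id: local field name}
    mappings: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def register(self, entry: SystemEntry) -> None:
        self.systems[entry.system_id] = entry

    def relate(self, element: str, system_id: str, local_field: str) -> None:
        """Record that a system's local field represents a shared element."""
        if local_field not in self.systems[system_id].schema:
            raise KeyError(f"{system_id} has no field {local_field!r}")
        self.mappings.setdefault(element, {})[system_id] = local_field

    def variations(self, element: str) -> Dict[str, str]:
        """Show how each system names and types the same element."""
        return {
            system_id: f"{local_field}: {self.systems[system_id].schema[local_field]}"
            for system_id, local_field in self.mappings.get(element, {}).items()
        }


# Illustrative entries only, loosely following the retail example.
hub = DataHub()
hub.register(SystemEntry("buying", "HQ", "db://buying.example", {"item_no": "CHAR(10)"}))
hub.register(SystemEntry("warehouse", "DC-1", "db://warehouse.example", {"sku": "INTEGER"}))
hub.relate("item_id", "buying", "item_no")
hub.relate("item_id", "warehouse", "sku")
print(hub.variations("item_id"))
```

The point of the sketch is that the hub stores descriptions of the data and how they relate, not the business data itself, which is what distinguishes it from a staging area.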