Only a vendor could love the advice: "No one tool will do the job; you need to buy both." Though unfortunate from an end-user cost-control point of view, there are specific instances where that is the case for extract, transform and load (ETL) and enterprise application integration (EAI) technologies. Having said that, it behooves prospective buyers to know the difference, because in many instances only one class of technology is needed.

Let's get our bearings. The distinctions between transaction-centric and query-centric systems, and between operational and decision support systems, remain fundamental. If a firm's requirements lie more in the area of synchronizing the transactions hitting an application server with related operational systems, then an EAI infrastructure will likely take priority. A data integration platform is not a substitute for an application server or its dynamic functionality; the latter is required to handle any heavy transactional workload, just as a transaction processing monitor was required prior to the explosion of the Internet. ETL functionality does not directly face customer or business-to-business (B2B) orders; rather, it is positioned in one of the layers between the front and back office, such as the data warehouse itself.

Tools such as DataStage XE (Ascential), DecisionStream (Cognos), ETI-Extract, PowerCenter (Informatica) and Sagent ship with dozens, even hundreds, of predefined functions for drag-and-drop mapping of data elements. This mapping activity determines precisely what the transformation server executes at runtime or, in the case of code-generating tools (such as ETI-Extract, Oracle Warehouse Builder, SAS Warehouse Administrator or DataStage XE/390), actually generates code (and environment scripts, including JCL) in C, COBOL or ABAP. Because of these data integration features, the ETL design workstations avoid the hand coding against an application programming interface (API) that many of the EAI products require. The ETL approach provides audit trails and impact analysis at the metadata level, and it avoids the need for point-to-point adapters unless dealing with enterprise resource planning (ERP) or customer relationship management (CRM) operational silos.
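To make the mapping notion concrete, consider the following sketch. It is purely illustrative (the column names, functions and record layout are assumptions, not any vendor's actual metadata model), but it captures what a drag-and-drop mapping amounts to once the transformation server, or the generated C or COBOL, runs it: a set of predefined functions applied to each source record to populate the target columns.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch of a declarative data-element mapping, similar in spirit
// to what an ETL design tool captures and its transformation server executes.
// All names here are hypothetical, not any vendor's actual metadata model.
public class MappingSketch {

    public static void main(String[] args) {
        // A source record as extracted from an operational system.
        Map<String, String> source = Map.of(
                "CUST_NO", "00042",
                "FNAME", "Ada",
                "LNAME", "Lovelace",
                "ORDER_AMT", "1234.50");

        // Target column -> predefined transformation applied to the source record.
        Map<String, Function<Map<String, String>, Object>> mapping = new LinkedHashMap<>();
        mapping.put("customer_id", r -> Integer.parseInt(r.get("CUST_NO")));                    // type cast
        mapping.put("customer_name", r -> r.get("FNAME") + " " + r.get("LNAME"));               // concatenation
        mapping.put("order_amount", r -> Math.round(Double.parseDouble(r.get("ORDER_AMT"))));   // rounding

        // The "transformation server" step: apply every mapping to the record.
        Map<String, Object> target = new LinkedHashMap<>();
        mapping.forEach((column, fn) -> target.put(column, fn.apply(source)));

        System.out.println(target); // {customer_id=42, customer_name=Ada Lovelace, order_amount=1235}
    }
}
```

In a real tool, of course, the mapping is captured as metadata rather than hand-written code, which is precisely what enables the audit trails and impact analysis described above.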

New to the market over the past year is that both the ETL code generators and the engine-based tools are leveraging the near real-time capabilities of IBM's MQ Series by interfacing with it (and, in some cases, TIBCO and Vitria) as another data source and target. In short, MQ Series is just another data source or target for the average ETL tool. This is another proof point that the ETL and EAI technologies continue to converge, though there are reasons to believe the convergence will remain partial; namely, the ETL tools lack intelligent routing capabilities. If end-user requirements are transaction intensive, the majority will be satisfied by following the EAI-server route; whereas, if the requirements are query-intensive, an ETL tool has the best chance of transporting and persisting the significant volumes of data required to perform aggregation and abstraction. Because many firms have deep requirements in both areas, it is possible to end up with both types of tools. Though Constellar gets credit for popularizing the term "data hub" in the ETL context, it failed to master the complexities of market positioning and fell into the gap between ETL and EAI. The acquisition of Constellar by DataMirror is reportedly contributing to DataMirror's own transformation into a data integration firm with roots in both log-level replication (of use in near real-time synchronization) and data aggregation (traditional ETL).
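For a rough sense of what "just another data source or target" means on the target side, the sketch below shows an ETL job writing each transformed row to a queue much as it would write to a table or flat file. It assumes a JMS connection factory obtained from the MQ provider; the queue name, row format and surrounding job logic are placeholders rather than any product's actual interface.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;

// Sketch: treat an MQ Series queue as just another ETL target. Each transformed
// row is written to the queue instead of (or in addition to) a table or file.
// The ConnectionFactory is assumed to come from the MQ JMS provider; the queue
// name and row format are placeholders.
public class QueueTarget {

    public static void publishRows(ConnectionFactory factory, Iterable<String> rows) throws JMSException {
        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(session.createQueue("WAREHOUSE.FEED"));

            for (String row : rows) {
                TextMessage message = session.createTextMessage(row); // e.g., a delimited or XML payload
                producer.send(message);
            }
        } finally {
            connection.close();
        }
    }
}
```

The source side is symmetrical: the same job can subscribe to a queue and feed each arriving message into the load path a file or table extract would otherwise use.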

A key differentiator between ETL and EAI remains the architectural commitment to intelligent routing on top of a store-and-forward mechanism, the approach characteristic of New Era of Networks (NEON), which was acquired by Sybase. It is a significant breakthrough that the ETL vendors, such as Ascential (formerly Informix/Ardent), Informatica and ETI, now interface with MQ Series as just another data source. They also produce and consume extensible markup language (XML) as an additional, standard data type, which will have significance for standards-based metadata interoperability. Watch for Sybase to leverage XML to integrate its EAI tool and its ETL function with its enterprise portal, a combination already offered by Ascential's Axielle portal. When questioned about mechanisms and support for managing the transactional boundary as events come off of MQ, the ETL vendors point to the ability to count records, respond to events, deploy a toolkit of predefined many-to-many functional transformations or drop down into standard programming languages such as C or COBOL (but not Java, as the EAI tools can) to code custom, plug-in solutions. ETL vendors do not currently seem to aspire to provide intelligent or dynamic routing of transactions, so it is possible that a coexistence strategy will result. The "gotcha" is that the convergence is likely to remain incomplete, and those firms with both query-intensive and transaction-intensive applications will require both ETL and EAI technologies to satisfy the full spectrum of their data integration requirements.
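One way to picture the "count records" answer to the transactional-boundary question is a transacted queue session that commits every fixed number of messages, so that a failure backs the uncommitted batch out onto the queue. The sketch below is an assumption about how such a boundary might be drawn, not a description of any particular product; the commit interval, queue name and transformAndLoad step are hypothetical.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Session;

// Sketch: draw the transactional boundary by record count. Messages are read
// inside a transacted session and committed in batches of COMMIT_INTERVAL, so
// a failure rolls the uncommitted batch back onto the queue. Illustrative only.
public class BatchBoundary {

    private static final int COMMIT_INTERVAL = 100;

    public static void consume(ConnectionFactory factory, String queueName) throws JMSException {
        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
            MessageConsumer consumer = session.createConsumer(session.createQueue(queueName));
            connection.start();

            int count = 0;
            Message message;
            // Poll until the queue is quiet for five seconds, then stop.
            while ((message = consumer.receive(5000)) != null) {
                transformAndLoad(message);                 // placeholder for the ETL work
                if (++count % COMMIT_INTERVAL == 0) {
                    session.commit();                      // boundary: every 100 records
                }
            }
            session.commit();                              // flush the final partial batch
        } finally {
            connection.close();
        }
    }

    private static void transformAndLoad(Message message) {
        // Placeholder: apply the mapping and write to the warehouse staging area.
    }
}
```

Whether the boundary is drawn by record count, by time window or by a business event is exactly the kind of decision that separates a batch-oriented ETL engine from an EAI broker doing intelligent routing.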
