Q:

Question: What are the implications to a data warehouse implementation striving toward a zero latency enterprise (ZLE) solution? Is a near ZLE solution currently feasible?

A:

Les Barbusinski’s Answer: Capturing, cleansing, integrating and publishing operational and analytical data in near real time has become a reality in just the last few years with the advent of new ETL and EAI technologies. Source data can be acquired and transformed asynchronously using message-oriented middleware (MOM) software (such as that offered by IBM, BEA Systems and others), real-time ETL tools, such as Informatica, or database replication. However, there are some issues that you need to be aware of:

  • Limit the data that needs to be acquired and/or published in near real time. The cost is considerable, and you’d be hard pressed to justify acquiring all source data for the data warehouse in near real time.
  • Near real-time ETL has to be transactional in nature. Each process must transform small amounts of data quickly and commit its updates frequently in order to minimize the threat of contention (via database locks) with online analytical queries that are running at the same time. You cannot use batch-oriented ETL scripts for near real-time data acquisition.
  • Data acquired in near real time is usually not as well cleansed or integrated as that acquired nightly. Hence, you may want to store near real-time data in separate tables that can be "unioned" to the standard data warehouse tables by those queries and reports that require up-to-the-minute information. The near real-time tables can then be cleared nightly as the data is reprocessed through the regular batch-oriented ETL scripts.

Doug Hackney’s Answer: It’s certainly fashionable to talk about and include in an RFP. Actually implementing and sustaining one is, however, orders of magnitude more challenging than loading at a daily, weekly or monthly frequency.

Mike Jennings’ Answer: Maybe not zero, but near-zero-latency updating of a data warehouse environment is available today through the use of asynchronous XML messaging and Web services to transport, map and transform data. There will be some latency due to transport and processing cycles. Current data warehousing strategies need to accommodate the translation of XML transactions (DTD, schema) for real-time integration and use in business intelligence environments. This is occurring today through the increased use of XML messaging for integration between operational and data warehousing systems, because XML is easy to implement, self-describing and format-neutral. Data changes in source systems can be immediately detected and sent to the data warehouse for near real-time analysis. Increased use of XML to standardize various industry business processes for application integration (HR – HR-XML Consortium, SCM – RosettaNet) provides additional opportunities for simplifying data integration into the warehouse. This is especially true for data warehouses that require near real-time transaction detail.
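As a rough illustration of translating an XML transaction into warehouse-ready form, the sketch below parses one message with Python's standard `xml.etree.ElementTree` module. The `<order>` message layout, its element names and the `order_message_to_row` function are assumptions made up for this example; they do not follow HR-XML, RosettaNet or any other real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical change message emitted by a source system.
SAMPLE_MESSAGE = """<order id="1001">
  <customer>ACME</customer>
  <amount currency="USD">249.90</amount>
</order>"""

def order_message_to_row(xml_text):
    """Map one (assumed) order message to a flat row for loading."""
    root = ET.fromstring(xml_text)
    if root.tag != "order":
        raise ValueError(f"unexpected message type: {root.tag}")
    amount = root.find("amount")
    if amount is None or amount.text is None:
        raise ValueError("order message has no amount")
    return {
        "order_id": int(root.get("id")),
        "customer": root.findtext("customer"),
        "amount": float(amount.text),
        "currency": amount.get("currency"),
    }

if __name__ == "__main__":
    print(order_message_to_row(SAMPLE_MESSAGE))
```

Because the message is self-describing, the mapping step needs only the element and attribute names; in practice the message would also be validated against its DTD or schema before loading.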

Joe Oates’ Answer: I assume that you are referring to the Gartner concept that "in a real-time, zero-latency enterprise, information is delivered to the right place at the right time for maximum business value." Certainly a "data warehouse-like" central information repository is a necessity.

In my opinion, a true ZLE is off in the future somewhere, except perhaps for very small businesses. Medium to large businesses have dozens to hundreds of disparate operational systems that are the sources for populating the ZLE central information repository (ZLECIR). The task of developing ETL from all of these systems into the ZLECIR would take months or years, even if an organization were willing to pay for the necessary hardware. Additionally, executing the ETL in real time would require significant changes to most existing applications – consider applications that do not produce a transaction log – and would rob resources from these applications. Operations managers would certainly not look too kindly on this kind of situation.

That having been said, several of our customers are developing the capability for "CETZLECIR" (close enough to zero latency enterprise central information repository). By this I mean that information is loaded into their data warehouse one or more times a day. Application systems have been modified to use information generated by data warehouse applications so that customer service representatives, salespeople and others who touch customers, suppliers, etc. are able to maximize business through properly organized, easy-to-use information that formerly was not available.

Clay Rehm’s Answer: I don’t see why it wouldn’t be feasible in the right situation. If you can allow transactions to update an operational database and at the same time update a data warehouse, then by all means!
