FEB 1, 2006 1:00am ET

Real-Time Data Warehousing Merges with Operational Reporting: How Will You Manage?

Since the early days of decision support system development, the latency between when transactions occur and when the data is available for reporting has been a headache for managers. Lately, new technologies such as messaging and enterprise application integration (EAI) software have provided increasingly better capabilities to build real-time data warehouses and better integrated analytics. This article looks at some of the options organizations have when attempting to increase the traditional decision support system's ability to also support timely operational reporting.

Data latency is an old reporting problem. In the 1960s and 1970s, managers had to load data from punch cards and tapes before reports could be generated. In the 1980s, specialized transaction systems and the emergence of PCs created a proliferation of spreadsheets and desktop tools that had to be periodically updated with fresh data. Some organizations met the challenge of real-time operational reporting by simply allowing users to query the transaction systems directly. For most organizations, this was not an option. There were simply not enough system resources to support a large number of users accessing gigabytes of data, and extensive querying of the systems could also delay critical processing of transactions.

In the 1990s, the enterprise resource planning (ERP) revolution reduced the number of transaction systems operated by organizations. At the same time, the evolution of better databases allowed companies to create specialized decision support systems, such as data warehouses and data marts that extracted data from a variety of sources and made it available for analytics. However, the core issue of data latency still existed, and the ultimate goal continued to be an analytical system with as near real-time access to operational data as possible. This would provide better analytics capabilities and better operational reporting.

XML and Messaging

In the late 1990s, companies explored XML as a silver bullet to get timely transactional data into data warehouses. The idea was to provide instant synchronous updates to the decision support systems as the transactions occurred. While the concept sounds simple, it has major implications. First, the transaction system has to create a fixed-format document for each transaction, something that can be quite time-consuming. Second, the documents often become large due to the tags and metadata embedded in each record. For example, transactions based on the proposed XML protocol known as the extensible messaging and presence protocol (XMPP) carry both an opening and a closing tag for each data point. If you want to send a simple record with a first and last name, it may look like:

<first_name>Jim</first_name>

<last_name>Smith</last_name>

While the record contains only eight characters of data (Jim and Smith), the transmitted document contains 56 characters, not counting line breaks. The overhead is even higher in practice due to additional tags that describe data types, definitions, headers and details. Unsurprisingly, XML-based messaging protocols such as XMPP have not been used extensively by very large data warehouses that may receive millions of transactions each day. However, XML has often been the backbone of marketplaces, short message services (SMS) and custom Web applications that transfer transactions to back-end processing systems such as SAP and Oracle.
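The overhead arithmetic above can be checked with a short Python sketch. It serializes the two fields with the standard library's `xml.etree.ElementTree` and compares character counts. The helper name `to_xml_fragments` and the dictionary layout are illustrative assumptions, not part of any protocol, and line breaks between elements are not counted.

```python
import xml.etree.ElementTree as ET


def to_xml_fragments(record):
    """Serialize each field as its own element, as in the example above.

    `record` maps a tag name to its text value (hypothetical layout).
    """
    fragments = []
    for tag, value in record.items():
        elem = ET.Element(tag)
        elem.text = value
        # encoding="unicode" returns a str without an XML declaration
        fragments.append(ET.tostring(elem, encoding="unicode"))
    return fragments


record = {"first_name": "Jim", "last_name": "Smith"}
fragments = to_xml_fragments(record)

# Characters of actual data versus characters actually transmitted
payload_chars = sum(len(value) for value in record.values())
document_chars = sum(len(fragment) for fragment in fragments)
overhead_ratio = document_chars / payload_chars

for fragment in fragments:
    print(fragment)
print(f"payload={payload_chars} document={document_chars} "
      f"ratio={overhead_ratio:.1f}x")
```

Counted this way, the tags account for most of the transmitted characters, and the ratio only worsens as type and header elements are added.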

Other vendors have taken a more proprietary approach to messaging and successfully created interface standards such as electronic data interchange (EDI) or IDocs that simplify the formatting and transporting of transactional records. The reduced overhead of these formats has allowed companies to automatically send records into their newly coined real-time data warehouses.
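The size difference can be illustrated with a minimal sketch. The pipe-delimited layout below is a hypothetical stand-in for a compact positional format. It is not actual EDI or IDoc syntax, and the function names are assumptions made for illustration. The point is only that when sender and receiver agree on field order in advance, the field names no longer travel with every record.

```python
def to_tagged(record):
    """Wrap every value in an opening and a closing tag (XML-style)."""
    return "".join(f"<{tag}>{value}</{tag}>" for tag, value in record.items())


def to_delimited(record, field_order, sep="|"):
    """Emit values only, in an order both sides agreed on beforehand.

    Hypothetical compact format; real EDI and IDoc layouts differ.
    """
    values = [record[name] for name in field_order]
    for value in values:
        if sep in value:
            # A real format would define an escaping rule here
            raise ValueError(f"value contains separator: {value!r}")
    return sep.join(values)


record = {"first_name": "Jim", "last_name": "Smith"}
field_order = ["first_name", "last_name"]

tagged = to_tagged(record)
delimited = to_delimited(record, field_order)

print(len(tagged), tagged)
print(len(delimited), delimited)
```

The trade-off is that the compact record is not self-describing: the receiver must already know the layout, which is part of what such interface standards define.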

Instant Messaging for Operational Reporting in the Data Warehouse

It is amazing what two years will do in terms of standardization. In 2003, the major contenders in the messaging standardization race were XMPP and a standard known as session initiation protocol for instant messaging and presence leveraging extensions (SIMPLE).

As mentioned, XMPP was great at handling simple records such as SMS traffic, but it had a huge overhead when transmitting large volumes of transactions (something it was never intended to do). On the other hand, a drawback of the competitor, SIMPLE, was that it provided core support for simple text messaging but had little support for other formats. Therefore, each vendor had to build its own extensions, which were often incompatible. Another problem with SIMPLE was that it supported the older user datagram protocol (UDP) as well as transmission control protocol (TCP) in the transport layer. Because UDP offers no delivery guarantees, data packets can be dropped and data lost, with limited ability to restart or track the process. This was not good for large reporting systems that relied on timely, accurate and complete data. As a result, the first versions of SIMPLE were not extensively used for real-time data warehouses and reporting systems.
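To make the data-loss concern concrete, the sketch below shows one common way a receiver can notice dropped messages on an unreliable transport: the sender numbers each message, and the loader checks for gaps before trusting the batch. This is an illustrative assumption, not a description of how SIMPLE or any particular product handled the problem. The function name and the simulated drop are hypothetical.

```python
def find_missing(received_sequence_numbers, expected_count):
    """Return the sequence numbers in 0..expected_count-1 that never arrived."""
    seen = set(received_sequence_numbers)
    return [n for n in range(expected_count) if n not in seen]


# Simulate ten transactions sent over a lossy channel that drops two of them
sent = list(range(10))
dropped = {3, 7}
received = [n for n in sent if n not in dropped]

missing = find_missing(received, expected_count=len(sent))
if missing:
    # A loader could request a resend or flag the batch as incomplete
    print("incomplete load, missing sequence numbers:", missing)
else:
    print("all transactions received")
```

With TCP, retransmission and ordering are handled by the transport itself, so this kind of bookkeeping is not pushed onto the application in the same way.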


Figure 1: Using Basic XML to Push Data to a Decision Support System

Microsoft Becomes an Enterprise Application Integration Provider

While SIMPLE had some issues, it was a great platform for vendors to launch their initiatives. In 2003, Microsoft worked on a project called "Real-Time Communication Server" to enhance the SIMPLE protocols. In 2004, Microsoft launched a new version of its messaging product, known as BizTalk Server 2004. It had two ambitious goals. First, it aimed at providing B2B integration. Second, it aimed to become the platform of choice for EAI, that is, the integration of both transaction systems and critical reporting systems within an organization.

With this latest release of BizTalk, which is based on the .NET platform, Microsoft provided a clearer alternative to a very confusing standardization race that involved dozens of overlapping standards and approaches to EAI. The core architecture of BizTalk 2004 is a simplified server system. For a decision support system in an EAI framework, BizTalk provides Business Activity Services (BAS), which are installed on the source system side to supply the messages. The administrator of the data warehouse can also monitor the load process from many source systems using BizTalk's business activity monitoring (BAM) tool.

Filed under:
SOA