Jonathan Wu would like to thank Thadi Murali for contributing to this month's column. Murali, a principal consultant with Knightsbridge Solutions, has extensive experience designing, developing and implementing information solutions for reporting, analysis and decision-making purposes. You can contact him at firstname.lastname@example.org.
Few individuals would dispute the benefits and value that data warehousing has provided by integrating and transforming data from disparate systems and separate business units into information for reporting, analysis and decision-making purposes. However, as products and services become commoditized within a rapidly changing environment, competition rises. The ability of individuals within the organization to analyze, plan and react to changing business conditions in a much more rapid fashion becomes an indispensable element of survival. Data warehousing technology by its very nature involves integrating, cleansing and transforming data, which creates latency in the availability of the information. Since the latency of data warehousing information impacts the ability to address transactions as they are created, a popular misconception that "real-time analytics and data warehousing is an oxymoron" has arisen.
Defining Real Time
Depending upon whom you ask, real-time availability of data could mean days, hours, minutes or seconds. Real time is relative to the availability of the data and expectations of individuals who need that information. For example, in a retail company, transactions are processed immediately at the point of sale. Real time, in this case, is the few seconds a customer is willing to wait between submitting a credit card and having the transaction approved. Conversely, accounting transactions such as payables may be processed in batch mode once a day. Real-time payables and cash flow analytics in this situation would be delayed to the next business day. For the purposes of this article, real time is defined as seconds or subseconds of latency.
Integrating Real Time and Historical Data
Organizations have benefited from incorporating information from a data warehouse into their real-time analytics. For example, a credit card company has a data warehouse that keeps the history of customer transactions. Batch processing analyzes data in the warehouse and identifies patterns, trends and abnormalities that are used to derive summary boundary conditions. In addition, patterns are derived for each customer segment or even for individual customers where history is available. As transactions are processed, they are evaluated against the boundary conditions that were previously derived from the data warehouse and used to trigger alerts if fraud is detected.
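The credit card example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column does not say how the boundary conditions are computed, so the rule used here (mean plus a multiple of the standard deviation of each customer's past amounts) and the names `derive_boundaries`, `is_suspicious`, `k` and `default_bound` are all hypothetical.

```python
from statistics import mean, pstdev


def derive_boundaries(history, k=3.0):
    """Batch step, run against warehouse history.

    history maps a customer id to a list of past purchase amounts.
    Assumed rule (not from the column): upper bound = mean + k * std deviation.
    """
    bounds = {}
    for customer, amounts in history.items():
        if amounts:  # customers without history get no individual bound
            bounds[customer] = mean(amounts) + k * pstdev(amounts)
    return bounds


def is_suspicious(customer, amount, bounds, default_bound):
    """Real-time step: compare one incoming transaction with the stored bound.

    default_bound is a hypothetical fallback (for example a segment-level
    bound) for customers who have no individual history.
    """
    return amount > bounds.get(customer, default_bound)


# Hypothetical history as it might be read from the warehouse.
history = {
    "c1": [20, 30, 25, 25],
    "c2": [900, 1100, 1000, 1000],
}
bounds = derive_boundaries(history)
```

The expensive work (scanning history) stays in the batch step, while the real-time step is a single dictionary lookup and comparison, so it adds very little latency to transaction processing.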
Another example is a company that sells music and uses a data warehouse to integrate and store all of its sales transactions from its various sales channels: point of sale (retail store), Internet and direct mail. The sales transactions are batch processed nightly into the data warehouse. After each nightly load, the transactions are analyzed to create summaries, such as the favorite genres of a particular customer or demographic. Once the summaries have been developed, the analytic engine combines the latest customer purchase information with the summarized information from the data warehouse to cross- and target-sell at the time a customer checks out his/her purchase. Figure 1 depicts the integration of real-time analytics and data warehouse information.
Figure 1: Integration of Real-Time Analytics and the Data Warehouse
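The music retailer's flow can be sketched as follows. This is a minimal illustration, not the company's actual system: the function names, the data shapes and the catalog titles are all hypothetical, and "favorite genre" is assumed to mean the genre a customer has bought most often.

```python
from collections import Counter, defaultdict


def summarize_favorite_genres(sales):
    """Nightly batch step: find each customer's most frequently bought genre.

    sales is an iterable of (customer_id, genre) pairs from all channels.
    """
    counts = defaultdict(Counter)
    for customer, genre in sales:
        counts[customer][genre] += 1
    return {c: ctr.most_common(1)[0][0] for c, ctr in counts.items()}


def suggest_at_checkout(customer, purchased_genre, favorites, catalog):
    """Real-time step: combine the current purchase with the stored summary.

    Titles from the customer's historical favorite genre come first,
    followed by titles from the genre being purchased right now.
    """
    genres = [purchased_genre]
    favorite = favorites.get(customer)
    if favorite and favorite != purchased_genre:
        genres.insert(0, favorite)
    return [title for g in genres for title in catalog.get(g, [])]


# Hypothetical warehouse contents and catalog.
sales = [("ann", "jazz"), ("ann", "jazz"), ("ann", "rock"), ("bob", "pop")]
favorites = summarize_favorite_genres(sales)
catalog = {"jazz": ["Jazz Title A"], "rock": ["Rock Title B"]}
suggestions = suggest_at_checkout("ann", "rock", favorites, catalog)
```

A customer with no history still receives suggestions based on the current purchase alone, which is the point-of-sale-only behavior; the warehouse summary is what adds the historical context.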
Shortcut Approaches to Real Time
In an effort to quickly provide real-time analytics to their information consumers, some organizations are taking shortcut approaches, often leading to inaccuracies, misinterpretations and ineffective solutions. One shortcut approach is to implement real-time analytic systems based on static rules which bypass analysis from a data warehouse. For example, if a credit card company applied this approach to fraud detection, the static rules might not be appropriate or effective at discovering potentially fraudulent transactions. If the static rule were to "alert the customer for possible fraud every time a purchase is made over $1,000," the $1,000 threshold might be inappropriate for one customer segment while excluding another segment of customers from consideration entirely. Just because a purchase is less than $1,000 does not mean it is not fraudulent; nor can it be assumed that all purchases over $1,000 are fraudulent. For a customer segment whose purchases have consistently been below $100, a purchase of $500 could be a cause for an alert. For real-time analytics to be effective, they have to be well thought out and should consider historical patterns, events and trends. The accumulation of historical information within a data warehouse provides the ability to discover these items, thereby helping the real-time analytic engine to constantly "learn" and evolve.
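The difference between the two approaches can be shown with a small sketch. The $1,000 static limit comes from the example above; the history-based rule (alert when a purchase exceeds a multiple of the largest purchase previously seen in the segment), the `multiplier` value and the sample amounts are assumptions made purely for illustration.

```python
STATIC_LIMIT = 1000.0  # the fixed threshold from the static-rule example


def static_rule(amount):
    """Shortcut approach: one fixed threshold for every customer."""
    return amount > STATIC_LIMIT


def segment_rule(amount, segment_history, multiplier=3.0):
    """History-aware approach (assumed rule, for illustration only).

    Alert when the amount exceeds `multiplier` times the largest purchase
    previously recorded for the customer's segment in the warehouse.
    """
    return amount > multiplier * max(segment_history)


# Hypothetical segment histories drawn from the warehouse.
low_spenders = [40.0, 75.0, 90.0]        # purchases consistently below $100
high_spenders = [800.0, 1500.0, 2000.0]  # large purchases are routine

missed_by_static = not static_rule(500.0) and segment_rule(500.0, low_spenders)
false_alarm_by_static = static_rule(1200.0) and not segment_rule(1200.0, high_spenders)
```

With these sample figures the static rule stays silent on a $500 purchase from the low-spending segment and raises an alert on a routine $1,200 purchase from the high-spending segment, while the history-based rule does the opposite in both cases.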
Another example is the "point of sale" shortcut approach where only a customer's latest purchases are analyzed. Although important, the latest purchases do not represent the customer's previous buying pattern, nor do they provide the ability to effectively target- or cross-sell. At the time of sale, a promotion becomes less effective if historical purchase information is not considered.
In the music retail example previously discussed, the company implemented a real-time target-selling solution that combined point-of-sale data with data from the warehouse. The point-of-sale data was also integrated into the data warehouse and used for direct mailing, e-mail promotions and the customer loyalty program to provide a comprehensive and effective approach to marketing.
While there is a growing need to have access to real-time analytics, data warehousing can enhance the value of this information by adding historical context and perspective. The organizations that were previously described are not only responding in a rapid fashion to their organizations' information needs but are truly taking into consideration the evolving business conditions that are reflected in the data and are realizing significant benefits from their approach. Incorporating information from a data warehouse significantly enhances real-time analytics.