JAN 28, 2010 3:02pm ET

How Smart is Real-Time BI?

We live in real time, minute by minute. News is no longer delayed by days or even hours; it is streamed in real time. We bank online and check our real-time balances. We book flights with real-time visibility of seat availability. Sales patterns change over time and from place to place. Currency valuations shift and alter profit margins. Balancing on this shifting terrain, business managers are expected to focus on business analytics and make informed business decisions. Real-time business intelligence ensures accurate data flows across the enterprise so that organizations can make quick decisions on pricing, shelving, service and product mix, based on the latest information.

According to Dr. Richard Hackathorn, creator of the Time-Value Curve, “the value of data is directly proportionate to how fast a business can react to it. In other words, a corporation loses money every time it delays getting information into the hands of decision-makers.”

Real-time BI has become crucial to staying competitive. To adopt it, an organization must understand the new challenges it raises and develop a solution that handles both the requirements and the technology hurdles at hand.

Real-Time Business Intelligence

The major goal of real-time BI is reducing the time taken for corrective action or initiative. Real-time BI is designed to control data latency, analysis latency and action latency. Companies must understand that ROI will also depend heavily on the ability of an organization to modify its business practices to take advantage of improved responsiveness in the IT system.
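The three latencies can be made concrete with a small calculation. The sketch below is illustrative only: it assumes data latency runs from the business event to the data being stored, analysis latency from storage to delivery of the analysis, and action latency from delivery to the corrective action. The class name, field names and timestamps are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class EventTimeline:
    """Hypothetical timestamps for one business event (names are illustrative)."""
    occurred: datetime  # the business event happens in an operational system
    stored: datetime    # the data is available in the low-latency store
    analyzed: datetime  # the analysis result reaches a decision-maker
    acted: datetime     # the corrective action is taken

    def data_latency(self) -> timedelta:
        return self.stored - self.occurred

    def analysis_latency(self) -> timedelta:
        return self.analyzed - self.stored

    def action_latency(self) -> timedelta:
        return self.acted - self.analyzed

    def total_latency(self) -> timedelta:
        # Time from the event to the action; the sum of the three parts.
        return self.acted - self.occurred


start = datetime(2010, 1, 28, 9, 0)
timeline = EventTimeline(
    occurred=start,
    stored=start + timedelta(minutes=5),
    analyzed=start + timedelta(minutes=20),
    acted=start + timedelta(minutes=50),
)
print(timeline.data_latency(), timeline.analysis_latency(), timeline.action_latency())
```

Splitting the total this way shows where the time actually goes: in this made-up timeline most of the delay is action latency, which faster data integration alone would not reduce.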

A real-time BI system has two main components: real-time data integration and real-time decision-making. The objective of the real-time data integration component is to capture business events from operational systems and integrate them into a low-latency store. This component supports real-time, data-on-demand processing. The real-time decision-making component, on the other hand, supports real-time performance management and real-time predictive analysis. Figure 1 gives an overview of real-time BI architecture.

Challenges of Real-Time BI

BI applications include decision support, query and reporting, online analytical processing (OLAP), statistical analysis, forecasting and data mining. Each of these components needs to be designed to operate in a real-time environment, and designing such a system poses many challenges. Some major challenges include:

Designing real-time ETL. Traditional ETL tools are batch oriented, wherein the data becomes available as some sort of extract file on a certain schedule, usually nightly, weekly or monthly. Then the system transforms and cleanses the data and loads it into the data warehouse. ETL tools tend to update systems with complete files, not compact amounts of change data. However, for real-time ETL, a continuous flow will be required throughout the day with minimum latency. Real-time operation requires the synchronization of data across multiple layers of an organization and many different sources. Connecting the large and diverse array of data sources to a real-time warehouse is highly complex.

Data modeling for real time. From a data architecture perspective, real-time data warehousing challenges the posture of the data warehouse as a system of periodic measurements, advocating the requirement for a system of more comprehensive and continuous temporal information, i.e., a real-time database model that deals with the temporal nature of data.

Search, OLAP, and query and reporting. Today's query and OLAP tools, not having been designed with real-time warehousing in mind, can produce unanticipated results.

Scalability. To support real-time processing, the system must have a scalable and flexible back-end database environment for loading and administering large amounts of data. The database must also be able to handle mixed workloads, since the tasks used to update low-latency stores will need to run in parallel with real-time decision-making applications. Real-time processing will involve real-time alert reporting through emails or messages. These alerts need to be designed to operate on real-time data feeds.
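As a rough illustration of an alert that operates on a data feed rather than on a nightly report, the sketch below consumes events one at a time and emits a message when a stock level first drops below a threshold. The event layout, the function name `stock_alerts` and the threshold are assumptions made for the example; delivery by email or message is left out.

```python
from typing import Iterable, Iterator


def stock_alerts(feed: Iterable[dict], threshold: int) -> Iterator[str]:
    """Yield one alert each time an item's stock first drops below threshold."""
    alerted = set()  # items that currently have an open alert
    for event in feed:
        item, qty = event["item"], event["qty"]
        if qty < threshold and item not in alerted:
            alerted.add(item)
            yield f"ALERT: {item} stock is {qty}, below {threshold}"
        elif qty >= threshold:
            alerted.discard(item)  # re-arm the alert once stock recovers


feed = [
    {"item": "milk", "qty": 12},
    {"item": "milk", "qty": 4},
    {"item": "milk", "qty": 3},
    {"item": "bread", "qty": 9},
    {"item": "milk", "qty": 15},
    {"item": "milk", "qty": 2},
]
alerts = list(stock_alerts(feed, threshold=5))
for message in alerts:
    print(message)
```

Because the function is a generator, it can be attached to any iterable source of events, and suppressing repeat alerts keeps a continuous feed from flooding recipients.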

Suggested Solutions

An ideal real-time BI tool would answer all of the challenges above. Work toward such a system has produced a number of design approaches, some of which are outlined here.

Micro batch ETL. A data warehouse can only be considered real-time, or near real-time, when all or part of the data is updated, loaded or refreshed on an intra-day basis, without interrupting user access to the system. The conventional, file-based ETL approach is extremely effective in addressing daily, weekly and monthly batch reporting requirements. Micro batch ETL built on log-based, real-time change data capture (CDC) technology can provide a nonintrusive means of real-time data acquisition from an operational data source. Figures 2 and 3 give a pictorial presentation of the system.
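The micro batch idea can be sketched in a few lines: instead of reloading a complete extract file, each run applies only the change records that arrived since the previous run. The record layout, the function name `apply_micro_batch` and the in-memory dictionary standing in for the warehouse are all assumptions for illustration, and the change list is assumed to be ordered by sequence number.

```python
from typing import Dict, List, Tuple

# A change record: (sequence number, operation, primary key, changed columns).
Change = Tuple[int, str, int, dict]


def apply_micro_batch(changes: List[Change], target: Dict[int, dict],
                      last_seq: int, batch_size: int) -> int:
    """Apply up to batch_size changes newer than last_seq; return the new offset."""
    pending = [c for c in changes if c[0] > last_seq][:batch_size]
    for seq, op, key, row in pending:
        if op == "delete":
            target.pop(key, None)
        else:  # "insert" and "update" are both treated as an upsert
            target[key] = {**target.get(key, {}), **row}
        last_seq = seq
    return last_seq


changes = [
    (1, "insert", 101, {"sku": "A", "price": 10.0}),
    (2, "insert", 102, {"sku": "B", "price": 20.0}),
    (3, "update", 101, {"price": 12.5}),
    (4, "delete", 102, {}),
]
warehouse: Dict[int, dict] = {}
offset = 0
offset = apply_micro_batch(changes, warehouse, offset, batch_size=2)  # first run
offset = apply_micro_batch(changes, warehouse, offset, batch_size=2)  # second run
print(offset, warehouse)
```

In practice a scheduler would call such a function every few minutes, and the returned offset would be persisted so that a restart resumes where the last batch ended.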

Log-based change data capture technology captures data changes in the source system as they happen and flows them immediately to the target system, ensuring business information is always reliable and timely. Most database management systems maintain a transaction log that records changes made to the database contents and to metadata. By scanning and interpreting the contents of this log, one can capture the changes made to the database in a nonintrusive manner. The principal task of the CDC capture process is to scan the log and write column data and transaction-related information to the CDC change tables. It detects when tables are newly enabled for CDC and automatically includes them in the set of tables that are actively monitored for change entries in the log. Similarly, it detects when CDC is disabled for a table and removes that source table from the monitored set. When processing for a section of the log is finished, the capture process signals the server log truncation logic, which uses this information to identify log entries eligible for truncation.
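The capture process described above can be approximated with a toy log scanner. This is a simplified sketch, not the behavior of any particular database product: the log entry layout, the `enable_cdc` and `disable_cdc` entry kinds, and the class name `CaptureProcess` are hypothetical, and a real server weighs other factors before truncating its log.

```python
from collections import defaultdict
from typing import Dict, List, Set


class CaptureProcess:
    """Toy log scanner; the log entry layout is hypothetical, not a real DBMS format."""

    def __init__(self) -> None:
        self.monitored: Set[str] = set()
        self.change_tables: Dict[str, List[dict]] = defaultdict(list)
        self.truncation_point = 0  # entries up to this LSN have been processed

    def scan(self, log: List[dict]) -> None:
        for entry in log:
            if entry["lsn"] <= self.truncation_point:
                continue  # already handled in an earlier scan
            kind, table = entry["kind"], entry["table"]
            if kind == "enable_cdc":
                self.monitored.add(table)
            elif kind == "disable_cdc":
                self.monitored.discard(table)
            elif table in self.monitored:
                # Write column data plus transaction info to the change table.
                self.change_tables[table].append(
                    {"lsn": entry["lsn"], "txn": entry["txn"],
                     "op": kind, "columns": entry["columns"]})
            # Signal how far the log has been processed.
            self.truncation_point = entry["lsn"]


log = [
    {"lsn": 1, "kind": "enable_cdc", "table": "orders"},
    {"lsn": 2, "kind": "insert", "table": "orders", "txn": 7,
     "columns": {"id": 1, "total": 40}},
    {"lsn": 3, "kind": "insert", "table": "audit", "txn": 7,
     "columns": {"id": 9}},
    {"lsn": 4, "kind": "disable_cdc", "table": "orders"},
    {"lsn": 5, "kind": "update", "table": "orders", "txn": 8,
     "columns": {"id": 1, "total": 45}},
]
capture = CaptureProcess()
capture.scan(log)
print(capture.change_tables["orders"], capture.truncation_point)
```

Only the insert at LSN 2 is captured: the `audit` table was never enabled, and the update at LSN 5 arrives after `orders` was disabled. The source tables themselves are never queried, which is what makes the approach nonintrusive.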
