The global marketplace has accelerated the pace of commoditization of products and services. As a result, businesses are seeking to differentiate themselves based on the strategic and tactical use of information. Data collected by corporate systems can be a significant competitive advantage as it can be used to extract knowledge that is exclusive to the company. This knowledge can be used to increase sales, enhance service and reduce fraud. Consequently, companies are collecting more and more data about their customers, products, suppliers and processes. However, this trend is straining corporate architectures because data warehouses and transactional systems are not equipped to handle the onslaught of new data. Further, while business intelligence (BI) systems have matured in their ability to sift through data to find insights, they continue to struggle in enabling a company to take action on the knowledge.

Fortunately, an emerging technology called complex event processing (CEP) can help. CEP allows a business to leverage its data assets by encapsulating knowledge into complex patterns of events and enabling an infrastructure for real-time response. It also buffers corporate networks and data warehouses from being consumed by data as new technologies are implemented to record every micro-event within an organization.

About Complex Event Processing

CEP is a technology for detecting patterns in underlying events in real time in order to identify or infer higher-level events. The idea is that a large number of seemingly innocuous and unrelated events - when properly correlated - may be indicative of very important, actionable events. CEP can be applied to a wide range of applications. In some architectures it sits alongside one or more transactional systems, monitoring a real-time flow of events in order to detect and route anomalies or situations that need attention. Business activity monitoring (BAM), fraud detection and predictive analytics are examples. In other cases, CEP sits in front of transactional systems, enterprise service buses (ESBs) or a data warehouse (DW) transforming streaming data into meaningful events.

Knowledge into Action

DW and BI systems have been successful in helping businesses discover valuable knowledge from corporate data. Analysis is done using OLAP tools or sophisticated data mining algorithms against historical data from the DW. For example, an e-commerce company may discover Product ABC is frequently purchased with Product XYZ. Or, a credit card company can learn that three or more successive purchases of more than $100 from different stores in a 15-minute span of time is likely to be fraudulent. Or, an equity trading firm may find that a buying opportunity exists when a stock surpasses its 52-week high and then falls back below within a 35-second window.
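To make the credit card example concrete, here is a minimal Python sketch of how such a pattern might be encoded for real-time detection. The function names, amounts and window are taken from the example above purely for illustration; they are not the API of any commercial CEP product.

```python
from collections import deque
from datetime import datetime, timedelta

def make_fraud_detector(min_amount=100, min_count=3, window=timedelta(minutes=15)):
    """Return a function that flags the pattern described above: three or
    more purchases over min_amount, from different stores, within window."""
    recent = deque()  # (timestamp, store) of qualifying purchases

    def on_purchase(timestamp, store, amount):
        if amount <= min_amount:
            return False
        recent.append((timestamp, store))
        # Drop purchases that have aged out of the sliding window.
        while recent and timestamp - recent[0][0] > window:
            recent.popleft()
        stores = {s for _, s in recent}
        return len(recent) >= min_count and len(stores) >= min_count

    return on_purchase

detect = make_fraud_detector()
t0 = datetime(2024, 1, 1, 12, 0)
detect(t0, "Store A", 150)                          # False
detect(t0 + timedelta(minutes=5), "Store B", 200)   # False
detect(t0 + timedelta(minutes=10), "Store C", 120)  # True: pattern matched
```

A production CEP engine would express this pattern declaratively rather than in hand-written code, but the sliding-window logic is the same idea.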

These examples represent knowledge. Each is a pattern of events that has been correlated with an outcome. While the knowledge itself is extremely valuable, a business needs to take action to benefit from the information. However, taking action is generally not the role of DW/BI systems. This is because they have the disadvantage of operating in back-office environments where they do not participate in the live interaction with customers, products, suppliers or operations. For example, DW/BI systems do not interact directly with customers and, therefore, cannot recommend product ABC when a customer is observed purchasing product XYZ. Further, there is generally a high degree of latency before DW/BI systems are updated with the latest data. The latency is often measured in days - much too late for an appropriate action in the examples just given.

Thus, knowledge needs to make its way out of the DW/BI environments and into the interaction systems where action can be taken. CEP can bridge this gap. A CEP engine encapsulates knowledge as patterns of events and then processes real-time streaming data to watch for occurrences of those patterns. When a pattern is observed, the CEP engine can generate a call or an alert to an interaction system such as a product recommendation module, a fraud prevention application or an algorithmic trading program. In a pervasively integrated network, the CEP engine may communicate with an ESB, where the detection of real-time patterns can be made available to all subscribing applications. In essence, CEP infuses knowledge into the corporate network, enabling a business to capitalize on information in real time.
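The publish/subscribe hand-off to interested applications can be sketched in a few lines of Python. The event name and handler below are hypothetical; in practice the CEP engine would publish to an ESB topic rather than an in-process dictionary.

```python
# Minimal publish/subscribe dispatch: when the engine observes a
# pattern, it notifies every application subscribed to that event type.
subscribers = {}

def subscribe(event_type, handler):
    subscribers.setdefault(event_type, []).append(handler)

def publish(event_type, payload):
    for handler in subscribers.get(event_type, []):
        handler(payload)

# A fraud prevention application registers interest in one event type.
alerts = []
subscribe("fraud.suspected", lambda event: alerts.append(event))

# The CEP engine publishes a detected pattern occurrence.
publish("fraud.suspected", {"card": "1234", "reason": "3 purchases in 15 min"})
# alerts now holds the event for the fraud prevention application
```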

Efficient Data Acquisition

CEP can also increase the return on data assets in another way: by transforming the volumes of superfluous and erroneous data emitted by streaming sources into meaningful, manageable information.

The quest for improved visibility and efficiency through use of information has led to the implementation of new devices for acquiring data on business processes. Technologies such as RFID, GPS and other devices have made it possible to collect volumes of data about every micro-event within a process. These technologies have dramatically lowered the cost of data acquisition as they emit or stream information passively without the need for costly human-initiated data capture.

However, while producing data on every micro-event may now be feasible, a challenge remains in turning this data into an asset. Streaming data tends to be in a very raw state. It is often plagued with inconsistencies, glitches and out-of-sequence records. Custom processing is typically required to cleanse and transform the data before it can be useful. Further, even if data integrity is assumed, the raw data is often so granular that the meaning inherent in the events is obscured. As an example, consider an RFID reader that generates records of tagged products sitting on a shelf. If it were to continuously stream this information, potentially millions of data records would be produced every hour. Most of these records would be of little value, simply stating that a product that was there a millisecond ago is still there now. This data is superfluous and requires analysis and processing to identify a change in the shelf's inventory, which is the real event of interest.
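The RFID scenario can be sketched as a simple stream transformation in Python: collapse a stream of raw presence scans into only the add/remove events that matter. The tag IDs are illustrative.

```python
def shelf_changes(readings):
    """Collapse raw RFID presence scans into add/remove events.
    Each reading is the set of tag IDs currently seen on the shelf;
    identical consecutive readings produce no output at all."""
    previous = set()
    for seen in readings:
        for tag in seen - previous:
            yield ("added", tag)
        for tag in previous - seen:
            yield ("removed", tag)
        previous = seen

scans = [{"tag1", "tag2"}, {"tag1", "tag2"}, {"tag1"}, {"tag1", "tag3"}]
events = list(shelf_changes(scans))
# Four scans, but only four events total: two "added" from the first
# scan, nothing from the identical second scan, then one "removed"
# and one "added" as the shelf's contents actually change.
```

Millions of identical "still there" records are discarded at the edge; only genuine inventory changes reach corporate systems.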

However, the processing needed to incorporate streaming data is both too costly and too slow to do within the confines of the corporate network. Enterprise databases and applications were not designed to handle data of this magnitude and velocity.

Here is where CEP can bridge another gap. A CEP engine can process streaming data in real time, using pattern-matching techniques to separate the meaning from the noise. It can be configured to output only occurrences of event patterns rather than the raw events themselves, buffering the corporate network from the barrage of raw data. The raw data remains ephemeral while the real meaning is surfaced and preserved - all before any data lands on disk or enters the corporate network. In the RFID example, rather than processing a data record of every product's existence at every time interval, the CEP engine can output a record only when a product is added to or removed from the shelf - a relevant event worthy of integration with corporate systems where action can be taken.

The concept of processing data prior to entry into the corporate network is referred to as edge processing. Vendors of RFID firmware and middleware provide some capabilities for this, as do the manufacturers of other streaming devices. However, commercial CEP engines do a good job of generalizing the process so that it can be applied to a wider range of scenarios. For example, consider the extremely granular and often superfluous data produced by these devices:

  • GPS. Is it necessary to record every location reading or just those when location has changed significantly?
  • Environmental sensors. Is it necessary to record every temperature, humidity or tilt reading? Or, is it only important to know when certain thresholds have been crossed? Or, are only aggregate readings over a longer period of time needed (minimum, maximum, median, mean, mode)?
  • Keystrokes on a POS system. Some retail companies have identified patterns of keystrokes that are indicative of employee theft. Is it necessary to record and store every keystroke and millisecond timestamp? Or, is it only important to know when the pattern of collective keystrokes has been detected?
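The GPS and sensor cases in the list above share one filtering idea: emit a reading only when it changes significantly. A minimal Python sketch, with an illustrative temperature stream and threshold:

```python
def significant_changes(readings, threshold):
    """Emit a reading only when it differs from the last emitted value
    by at least `threshold`; the rest of the stream is discarded at the
    edge. Units and threshold here are illustrative."""
    last = None
    for value in readings:
        if last is None or abs(value - last) >= threshold:
            yield value
            last = value

temps = [20.0, 20.1, 20.2, 22.5, 22.6, 19.0]
filtered = list(significant_changes(temps, threshold=2.0))
# filtered is [20.0, 22.5, 19.0]: six raw readings reduced to the
# three that represent a meaningful change
```

The same shape of filter applies to GPS position deltas or POS keystroke patterns, with the comparison function swapped out.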

Similar to how BI systems can discover the knowledge contained within the DW, the CEP engine surfaces the meaning that is inherent in, yet elusive from, raw streaming data. CEP turns the streaming data into actionable real-time events while also reducing the cost of integration with corporate systems.

As commoditization threatens more companies and industries, those that make the best use of their information assets will have the best chance for continued prosperity. CEP is the latest technology for leveraging corporate information. It enables a business to take action on knowledge discovered through DW/BI systems and makes data acquisition more efficient.
