Predictive analytics is widely touted as one of the "next big things" in business intelligence. Yet most discussions of the topic really describe the "same old things" that statisticians and business analysts have been talking about for years: models and scoring systems that predict the likely behavior of customers, prospects and (outside of marketing) other variables such as prices and demand. These always were, and still are, immensely valuable applications of business data. But they are not new.

The generic descriptions of predictive analytics include a few hints of fresh approaches. Even these are not truly new ideas, but they are only now becoming practical for broad implementation, largely because the infrastructure needed to supply the data and deliver the results - data warehouses and integrated customer management systems - has only recently become widely available. These newer components include:

Real-time execution. Traditionally, predictions were made in batch processes such as scoring prospect pools for outbound marketing campaigns or assigning a customer rank to guide customer service treatments. Scores were static, calculated either once or at regular intervals. This largely reflected the difficulty of assembling the underlying data - remember when most data warehouses were updated monthly? - and of building the predictive models and applying the scores.

Today, predictions are increasingly calculated as needed and as new data is generated during real-time interactions. Pricing is adjusted based on current inventory and continually revised demand forecasts. Offers are optimized through constant testing of alternatives. A prospect's most recent search request is factored into the choice of which products to offer next.
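To make the idea concrete, here is a minimal sketch of what in-interaction scoring can look like: a hypothetical function that folds a visitor's latest search term and recent browsing history into a product ranking. All names, fields and weights are illustrative, not drawn from any particular product.

```python
# Hypothetical real-time scoring sketch: rank offers for a visitor,
# factoring in the search term they just typed. All names are illustrative.

def score_offers(profile: dict, latest_search: str, offers: list[dict]) -> list[dict]:
    """Return offers sorted by a freshly computed score."""
    scored = []
    for offer in offers:
        score = offer["base_propensity"]          # from the last batch model run
        if latest_search.lower() in offer["keywords"]:
            score += 0.25                          # boost for the live search signal
        if offer["category"] in profile.get("recent_categories", []):
            score += 0.10                          # boost for recent browsing history
        scored.append({**offer, "score": score})
    return sorted(scored, key=lambda o: o["score"], reverse=True)
```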

Front-line deployment. Real-time analysis is most important when it can change the course of an ongoing interaction. This means that analysis must be pushed to front-line customer interaction systems such as call centers and Web pages. These systems must be able to capture the interaction data, transmit it to the predictive analytics engine as part of a prediction request and receive the engine's output. Or, the engine itself must somehow be embedded within the front-line system, perhaps by having the front-line system do scoring calculations using a formula the analytics engine has generated. Service-oriented architectures (to mention another next big thing) make this sort of integration much easier than it used to be. What's new, and important, is the ability of the front-line system to choose which predictive analysis capabilities it will execute, rather than passively accepting whatever predictions were prepared in advance. As a result of this new ability, the front-line system can string together multiple predictions as needed to guide the course of an interaction.
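The front-line system's half of this exchange can be as simple as a service call. The sketch below assumes a hypothetical /predict endpoint exposed by the analytics engine; the payload fields and response shape are illustrative, not any vendor's actual API.

```python
# Sketch of a front-line system requesting a prediction over HTTP.
# The endpoint, payload fields and response shape are assumptions.
import json
import urllib.request

def request_prediction(customer_id: str, interaction_data: dict,
                       model: str = "next_best_offer") -> dict:
    payload = json.dumps({
        "customer_id": customer_id,
        "model": model,                 # the front-line system chooses the model
        "context": interaction_data,    # data captured during this interaction
    }).encode("utf-8")
    req = urllib.request.Request(
        "http://analytics.example.com/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=2) as resp:  # fail fast in real time
        return json.load(resp)
```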

Customer-level evaluation. Traditional scoring methods made choices about groups of customers, such as which groups to include in a promotion or which groups would be offered each product. Of course, the groups were still made up of individuals, but everyone within a group was treated the same and all analysis of results was conducted at the group level. A model that performed well on average was considered a success.

Newer predictive analytics applications make decisions about specific individuals in specific situations. New decisions are needed as a situation evolves, so several scores may be generated during the course of a single interaction. This means that customers cannot be treated as members of large, undifferentiated groups, because small differences in individual history become important factors in selecting treatments. Success is measured not by how accurately a specific model predicts results but by the value attained from an interaction - and, really, it should be measured across many interactions by tracking the long-term change in customer value.

These three changes - real-time execution, front-line deployment and customer-level evaluation - pose significant challenges for traditional predictive analytics techniques. When predictions were created by expert users and deployed in a small number of large projects, the users could be trusted to carefully monitor the quality of the incoming data, models and final applications. But in the new situation, many more models are used by less sophisticated users without a simple way to measure the quality of the results. The only practical way to create the required number of models is to rely on automated methods, but this removes the expert model builders who are most likely to identify problems or opportunities for improvement. Stringing together multiple predictions during an interaction makes it likely that certain combinations of outputs will yield poor results, but it is hard to find those combinations among the majority that work correctly. Bad data on individual customer records, which has little impact when the customers are lumped into large groups for simple treatments, becomes more dangerous when it is used to generate specific recommendations during one-on-one interactions.

In an automated front-line system such as a Web site, there are no human users to notice when things go awry. Even in a call center, agents are likely to be trained not to question system recommendations and, once they have gained confidence based on initial success, are unlikely to examine them critically. Nor can they intuitively judge whether the recommendations are optimal when the ultimate measure of success is long-term customer value. Making error detection still harder, the systems must be designed to function even if the models are failing. This typically means supplying default recommendations in place of the model-generated ones. Although this is operationally appropriate, it also means that end users may not even be aware that a problem exists.
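In practice, that fallback is often just a guarded call at serving time: if the engine is slow, unreachable or returns nothing usable, serve the default and log the failure. A minimal sketch, with illustrative defaults, reusing the hypothetical request_prediction call from the earlier sketch:

```python
# Sketch of serving-time fallback: a model failure should never block the
# interaction, but it should be logged so the problem does not go unnoticed.
import logging

logger = logging.getLogger("recommendations")

DEFAULT_OFFERS = ["best_seller_1", "best_seller_2"]   # illustrative defaults

def recommendations_for(customer_id: str, context: dict) -> list[str]:
    try:
        result = request_prediction(customer_id, context)   # hypothetical call above
        offers = result.get("offers", [])
        if offers:
            return offers
        logger.warning("empty prediction for %s; serving defaults", customer_id)
    except Exception:
        logger.exception("prediction failed for %s; serving defaults", customer_id)
    return DEFAULT_OFFERS
```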

Addressing these challenges imposes new requirements on predictive analysis systems. The most obvious is the need for fast, efficient model building and deployment. This typically requires using nontraditional techniques to develop and validate the models, because traditional methods tend to be slow and labor-intensive. The resulting models may not be as accurate as those built with traditional methods, but it is usually a worthwhile tradeoff to give up some accuracy in return for getting more models more quickly.

The new system must be able to monitor its own performance and adjust when old models stop working well. There are many reasons such a fall-off may occur, ranging from changes in source data, to temporary interruptions in data feeds, to competitive and environmental factors, to evolution of the underlying customer behavior itself. Different problems call for different solutions, but the system must always compare its results with a minimum acceptable performance level and react when that level is not met. This implies an automated feedback mechanism. When results are not satisfactory, at a minimum the system needs to inform its administrator and switch to a default mode. Ideally, it will also run automated diagnostics to identify why the recommendations are not performing and take corrective action.
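A minimal sketch of such a threshold check, assuming the system tracks a simple success or failure outcome per recommendation; the threshold value and the notification hook are illustrative:

```python
# Sketch of automated self-monitoring: compare recent results with a
# minimum acceptable performance level and react when it is not met.

MIN_ACCEPTABLE_RATE = 0.02   # e.g., lowest tolerable offer-acceptance rate

def notify_administrator(message: str) -> None:
    print(f"[ALERT] {message}")          # stand-in for a real alerting channel

def serving_mode(recent_outcomes: list[bool]) -> str:
    """Decide whether to keep serving model output or fall back to defaults."""
    if not recent_outcomes:
        return "default"                 # no feedback yet: play it safe
    rate = sum(recent_outcomes) / len(recent_outcomes)
    if rate < MIN_ACCEPTABLE_RATE:
        notify_administrator(f"model performance below threshold: {rate:.3f}")
        return "default"                 # switch to default recommendations
    return "model"
```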

The same fundamental ability to monitor results supports automated optimization. The simplest form of this is to create competing models, test them against each other and adopt the winner as the new standard. If this is done continuously and automatically, such an approach naturally adjusts over time to whatever changes cause old models to fail. A more sophisticated approach to optimization moves beyond head-to-head testing of individual models to find when a single model can be replaced with multiple models, each of which applies in a particular situation. This often makes sense in startup situations where the amount of historical data is initially limited but will grow over time.
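A bare-bones sketch of that champion/challenger loop, with an assumed traffic split and a deliberately naive promotion rule (a production system would want a proper significance test):

```python
# Sketch of continuous champion/challenger testing: route a small share
# of traffic to the challenger, compare observed results and promote the
# challenger if it wins. The split and decision rule are assumptions.
import random

class ChampionChallenger:
    def __init__(self, champion, challenger, challenger_share: float = 0.1):
        self.models = {"champion": champion, "challenger": challenger}
        self.share = challenger_share
        self.results = {"champion": [], "challenger": []}

    def pick(self) -> str:
        """Choose which model handles the next request."""
        return "challenger" if random.random() < self.share else "champion"

    def record(self, arm: str, success: bool) -> None:
        self.results[arm].append(success)

    def maybe_promote(self, min_trials: int = 1000) -> None:
        c, ch = self.results["champion"], self.results["challenger"]
        if len(c) >= min_trials and len(ch) >= min_trials:
            if sum(ch) / len(ch) > sum(c) / len(c):
                # Challenger wins: it becomes the new standard, and a new
                # challenger would be installed to keep the test running.
                self.models["champion"] = self.models["challenger"]
            self.results = {"champion": [], "challenger": []}
```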

A third major requirement is that the system easily integrate individual models into the larger flow of customer interactions. A single interaction may involve multiple predictions, and a long-term customer relationship will involve many interactions. Part of this integration is technical; it must be easy for managers to define an interaction process and specify the points within this process where predictions will be requested, received and used. But the more difficult part of the integration comes back to response analysis; when assessing results, the system needs to be able to measure the independent performance of individual models as well as the result of several models working in concert. This means the response analysis function must identify and link prediction requests that are part of the same interaction and relate these to an ultimate objective.

The linkage might involve nothing more than an interaction identifier similar to the session ID used to link different events during a Web site visit. A more powerful approach would actually import the structure of the interaction - that is, the rules governing the interaction flow - from the front-end system itself. This would allow the system to simulate the outcome of alternative decisions, which is necessary for true optimization.
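A sketch of the simpler, identifier-based approach: tag every prediction request with an interaction ID, then tie the interaction's ultimate outcome back to each prediction so response analysis can evaluate models individually and in combination. The storage and field names are illustrative:

```python
# Sketch of linking prediction requests that belong to one interaction,
# so response analysis can credit (or blame) each model in the chain.
import uuid
from collections import defaultdict

interaction_log: dict[str, list[dict]] = defaultdict(list)

def new_interaction() -> str:
    return uuid.uuid4().hex            # interaction ID, like a web session ID

def log_prediction(interaction_id: str, model: str, output: dict) -> None:
    interaction_log[interaction_id].append({"model": model, "output": output})

def log_outcome(interaction_id: str, objective_value: float) -> None:
    # Relate every prediction in the interaction to the ultimate objective,
    # so individual models and combinations can both be evaluated later.
    for entry in interaction_log[interaction_id]:
        entry["objective_value"] = objective_value
```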

Optimization also requires that outcomes be judged against an ultimate objective. This could be maximum revenue or profit contribution or an estimated change in long-term customer value. Of course, building a customer value model is a substantial predictive analytics project in itself, so this model might exist apart from the portions of the system that use predictive analytics to guide interactions.

It is easy to get carried away with the vision of an autonomous, self-optimizing predictive analytics system. In reality, such systems operate within the narrow frameworks provided by human-designed interaction flows and treatment alternatives. The potential for both benefit and harm is, therefore, somewhat lower than it might seem. Yet even within these constraints, real-time, front-line, customer-level predictive analytics offer substantial new opportunities to increase business value. They are definitely worth doing - so long as you take the time and effort to ensure they are done right.
