Customer Analytics: It's All About Behavior

  • June 01 2004, 1:00am EDT

Not all 45- to 55-year-olds with a household income between $50,000 and $75,000 have the same purchase interests and spending habits. For this reason, static demographic data should not be used as the building blocks of a well-defined customer segmentation system. Demographic data may be used to describe customer segments (profiling), but it is much less effective in distinguishing interests and spending habits than customer behavioral data.

Behavioral data goes beyond knowing that a customer has purchased a certain product. It involves capturing customer events and actions over time and using these stored interactions to determine typical behavior and deviations from that behavior.
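As a rough illustration of "typical behavior and deviations from that behavior," the sketch below treats the mean of a customer's earlier monthly spend as the typical level and expresses the latest month as a number of standard deviations from it. The spend figures and the function name are hypothetical, and a z-score is only one of many possible deviation measures.

```python
from statistics import mean, stdev

# Hypothetical monthly spend for one customer, oldest first (illustrative values).
monthly_spend = [120.0, 110.0, 130.0, 125.0, 115.0, 40.0]

def deviation_from_typical(history):
    """Return (baseline mean, z-score of the latest value against earlier values)."""
    *past, latest = history
    baseline = mean(past)
    spread = stdev(past)
    # Guard against a flat history, where the z-score is undefined.
    z = 0.0 if spread == 0 else (latest - baseline) / spread
    return baseline, z

baseline, z = deviation_from_typical(monthly_spend)
print(f"typical spend: {baseline:.2f}, latest month: {z:.2f} standard deviations away")
```

A large negative value, as in this made-up history, would flag a sharp drop from the customer's usual spending.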

Customer analytics exploit customer behavioral data to identify unique and actionable segments of the customer base. These segments may be used to improve targeting. Ultimately, customer analytics enable effective and efficient customer relationship management. The analytical techniques vary based on objective, industry and application, but may be divided into two main categories: segmentation techniques and predictive models.

Segmentation techniques identify groups within the customer base that have similar spending and purchasing behavior. Such groups are used to enhance the predictive models as well as improve offer and channel targeting.

Predictive models predict profitability or likelihood and timing of various events based on typical customer behavior and deviations from that behavior.

The Business Processes

Business processes drive the needs and objectives of the analytic projects. The data miners who develop the analytical applications should have a basic knowledge of the business process at hand and work in conjunction with departmental representatives. A clear understanding of the business process as well as the outcome desired by the various departments is essential. Setting the analytical objectives in accordance with all departmental objectives gives everyone an understanding of every function in the model building and application process. Marketing, IT and analytics must understand each role and how each output affects subsequent inputs. For example, if analytics and marketing do not understand the IT infrastructure and database configuration prior to deciding on the analytical techniques and data that will be used, then the amount of time required to implement the analytic solution may not be cost-effective. The first step in every analytical process is to bring all parties together and map the expected timelines of the process.

Customer interactions include browsing, purchasing, paying and communicating with customer service or sales. It is these interactions that may be used to develop customer profiles and eventually predict future actions. It is important to understand which of these interactions were marketing-driven and which were due to chance. The impact of marketing, risk and customer service decisions on customer behavior must also be tracked. The knowledge that a customer purchased due to a marketing campaign may be used to optimize future campaigns. Customer behavioral analysis combined with increased marketing efficiency will enhance future customer interactions. Setting aside control groups (samples held out of each campaign over time) will allow for marketing effectiveness tracking as well as the continued enhancement of the customer analytic applications.
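One simple way to separate marketing-driven purchases from those that would have happened anyway is to compare the response rate of the contacted group with that of a held-out control group. The sketch below assumes hypothetical counts and function names; it is not a prescribed measurement method.

```python
def response_rate(responders, group_size):
    """Share of a group that responded."""
    return responders / group_size

def campaign_lift(treated_resp, treated_n, control_resp, control_n):
    """Incremental response rate: contacted group minus held-out control group."""
    return response_rate(treated_resp, treated_n) - response_rate(control_resp, control_n)

# Hypothetical counts: 10,000 customers contacted, 2,000 held out as a control group.
lift = campaign_lift(treated_resp=450, treated_n=10_000, control_resp=60, control_n=2_000)
print(f"incremental response rate: {lift:.1%}")
```

Here the control group's response rate stands in for purchases "due to chance," so only the difference is credited to the campaign.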

The Data

Once all interactions and their data sources are identified, the next step is to develop a time-dependent data repository. It is important that the data is stored as a time series, because true behavior can only be identified over time. The sudden availability of inexpensive storage and data warehousing techniques has made the once unimaginable a reality for all industries. Issues in the past included access to centralized data and the time needed to identify and extract the data. Now, with massive data warehouses and seamless extraction tools, models may be built and deployed faster than ever. Combining all available data sources and developing the data warehouse can be difficult and time-consuming, but is well worth the effort. When predicting events and developing segmentation systems, it is important to understand that the data and the correct use of the data will maximize the results.

The most time-intensive part of the analytical process is data extraction and transformation. Transforming the raw data into actionable behavioral-identifying attributes takes work; but over time, the process may be streamlined. This process matches and aggregates the database sources into the final data set and is best completed outside the database environment. Other tools, including various analytical software packages, have been optimized to complete this type of batch processing. Databases have the overhead of referential integrity, data consistency rules, undo logs and redo functionality, all of which slow the processing and data transformation required when building the "minable" data set.

The best way to determine the transformations needed is to work backward from the event to be predicted (the dependent variable). Based on the modeling objective, list the behaviors that are known to indicate that potential action. For example, if a mortgage company would like to identify customers who have a greater chance of paying off a loan in the next three months, identify those behaviors that indicate said potential outcome (e.g., a sudden stop in prepaying principal, rate checks on the Web site, current market interest rates more than 1 percent less than their rate and revolving credit lines significantly increased the past few months).
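The mortgage example can be sketched as a set of behavioral flags derived from raw attributes. Everything here is an assumption made for illustration: the field names, the three-month window, the reading of "1 percent" as one percentage point and the 25 percent threshold for a "significant" credit line increase.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class MortgageCustomer:
    prepayments: List[float]      # extra principal paid per month, oldest first
    rate_checks_last_90d: int     # rate lookups on the Web site
    loan_rate: float              # customer's current rate, in percent
    market_rate: float            # current market rate, in percent
    credit_line_now: float        # total revolving credit today
    credit_line_3m_ago: float     # total revolving credit three months ago

def payoff_indicators(c, credit_growth_threshold=0.25):
    """Turn raw attributes into the behavioral flags listed in the text."""
    earlier, recent = c.prepayments[:-3], c.prepayments[-3:]
    credit_growth = (c.credit_line_now / c.credit_line_3m_ago - 1
                     if c.credit_line_3m_ago > 0 else 0.0)
    return {
        # Prepaid in the past, but not at all in the last three months.
        "stopped_prepaying": any(p > 0 for p in earlier) and all(p == 0 for p in recent),
        "checked_rates": c.rate_checks_last_90d > 0,
        # Market rate more than one percentage point below the customer's rate.
        "market_rate_gap": (c.loan_rate - c.market_rate) > 1.0,
        "credit_line_jump": credit_growth > credit_growth_threshold,
    }

example = MortgageCustomer([200, 200, 200, 0, 0, 0], 2, 6.5, 5.0, 15_000, 10_000)
flags = payoff_indicators(example)
print(flags)
```

Each flag then becomes a candidate independent variable for the payoff model.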

Once all the possible actions are listed, identify the data attributes needed to develop the final transformations. During this process, extract as much data as is feasible. Although certain attributes may not seem relevant, the interaction with additional behaviors may enhance the predictions and segmentations. It is the responsibility of the statistician and data miner to select the most favorable combination of attributes. Given obvious time constraints, when asked if a particular attribute may be useful, the answer is almost always "it can't hurt."

When available, use customer interaction data prior to purchasing additional attributes. Depending on the objective, models utilizing rich customer behavioral data will outperform models with only static demographic data by as much as 300 percent. Once the customer behavior data has been used, research and test additional enhancement data sources. Enhancement data plays a key role in customer analytics. Purchased demographic, lifestyle and credit data may be used to help describe model outputs, link customer segments to similar prospects and enhance the development of prospecting models.

The Analytical Techniques

As mentioned, data mining techniques may be placed into two categories. Predictive models use previous customer interactions to predict future events, and segmentation techniques are used to place customers with similar behaviors and attributes into distinct groups (clusters). Similar groups and predicted events allow marketers to optimize their campaign management and targeting processes.

Predictive Models

Predictive models attempt to predict a binary event (e.g., respond, purchase, default) or continuous types of outcomes (e.g., margin). The statistics and data mining techniques vary based on analytic objectives. When trying to predict a binary event, the predominant techniques are logistic regression and neural networks. When predicting continuous outcomes such as contribution to margin, the predominant techniques are neural networks and regression. Additional techniques have been optimized for those modeling objectives that do not fit into the previous categories. Poisson regression may be used to predict count outcomes (e.g., 0, 1, 2) and hazard models may be employed to predict the time to an event. Examples of their use include predicting the number of collection calls needed to receive a payment and optimizing retention efforts (time until attrition).
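For the binary case, a minimal sketch of logistic regression with a single behavioral attribute is shown below, fitted by plain gradient descent. The data, learning rate and epoch count are hypothetical, and a production model would normally be built with a statistical package rather than hand-written code.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical data: site visits last month and whether the customer purchased (1) or not (0).
visits = [0, 1, 2, 3, 4, 5, 6, 7]
bought = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(visits, bought)
for v in (0, 7):
    print(f"visits={v}: estimated purchase probability {sigmoid(w * v + b):.2f}")
```

The fitted probability rises with the number of visits, which is the kind of score that would be stored for each customer and used for targeting.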

The data miner's experience is a primary driver for which technique should be used. Several appropriate modeling techniques are typically available for each objective and tend to produce similar results. When selecting techniques, other issues to consider are the number of variables selected and the model application environment. For instance, neural networks tend to choose more variables than regression techniques. This may cause issues in future model scoring or implementation costs if enhancement attributes are used. Also, the best model may not produce the greatest ROI if it contains expensive enhancement attributes or customer attributes that are very difficult to implement. The total cost to implement the analytical solution should be taken into account when selecting the final model.

During the model build process, the optimal transformation and combination of variables must be determined using data reduction techniques. These techniques vary based on the modeling technique used. Regression techniques tend to capture linear relationships well, while neural networks and various other techniques are better at capturing nonlinear relationships. The goal in data transformation and reduction is to maximize the relationship between the variables used to predict the event (independent variables) and the event of interest (dependent variable) while minimizing error.

Many times, the final model variables may not be the best individual variables. For example, if age and income are the most predictive attributes, one of these may not be selected in the final model because age is highly related to income - as age increases, so does income. Much of the information collected within age and income independently overlaps, and this overlap creates error. Data reduction techniques minimize this error in order to maximize the model's predictive power. Once the final model is selected, it should be verified with a separate data set. The verification data set may be a sample from the same time frame; however, if possible, it is best to have several verification data sets from various time frames. Verification data sets from alternate time frames ensure that the modeled data set was a good representation of the customer universe. Seasonality issues may also be explored with several verification data sets from different time periods.
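The overlap between age and income can be checked with a simple correlation before deciding whether both belong in the model. The sketch below uses made-up values and a hypothetical cutoff of 0.8; it illustrates the idea of redundancy only and is not a full data reduction procedure.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical customers: income rises with age, as in the example in the text.
ages = [25, 32, 41, 47, 53, 60]
incomes = [38_000, 52_000, 61_000, 70_000, 74_000, 90_000]

r = pearson(ages, incomes)
# Assumed rule of thumb: above this cutoff, treat the pair as largely redundant.
redundant = abs(r) > 0.8
print(f"correlation between age and income: {r:.2f}, redundant: {redundant}")
```

When two attributes are this closely related, keeping only one of them (or a combined variable) usually loses little predictive information.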

Segmentation Techniques

The goal in developing a segmentation scheme is to place customers in groups that are as similar as possible. As the behavior variation within the groups decreases, the results of the targeting efforts will improve. These groups may be used to target product, channel and creative factors as well as several additional marketing and risk components. These groups also enhance the predictive models. Segmentation techniques vary from age by income cells to full clustering systems.

Many companies deploy segmentation schemes by cherry-picking certain demographic attributes and using 5 to 10 groups as the basis for all marketing efforts. This is a good start; however, if customer interactions are tracked and stored, more sophisticated techniques are available. These techniques use all customer behavior data available to develop unique clusters that are similar in interest, browsing and purchasing behavior. Once the clusters are developed on the customer behavioral data, then additional demographic data may be appended to obtain a better understanding of the clusters. Knowing whether to include the demographic data within the clusters requires knowledge of the customers and marketing requirements. The clustering techniques provide output statistics, which help determine if a certain clustering system provides more "distinct" clusters.

Similar to predictive models, the first step in all segmentation studies is to define the objectives with all departments and extract the data. Each attribute used should provide information about the customers' interactions with the company. Additional enhancement data may also be used; however, the cost associated with that data should be considered when comparing clustering systems with and without the demographic data. If it is determined that the demographic data does not provide additional segmentation, then that demographic data may be used to describe the final clusters. When demographic data is used to describe clusters, it is a simple process of appending the data after the clusters are built to construct demographic profiles for each cluster. If the demographic data is used to build the clusters, then that data must be updated (repurchased) when the clusters are updated. The update schedule varies based on customer volatility, customer interaction frequency and data needs.

When developing clusters, it is best to standardize the variables. Standardizing the data puts each attribute on a comparable scale so that no single attribute dominates the clusters simply because of its units. Data transformation techniques are also available if marketing or the data miner would like certain attributes to contribute more or less to the final clusters. After the attributes are standardized, they may be placed within the clustering technique of choice. If enough time is available, a principal component analysis (PCA) should be completed. PCA squeezes out much of the overlap across the input variables, reducing error. The output of the PCA is then placed into the clustering technique.
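Standardization is commonly done by converting each attribute to z-scores (subtract the mean, divide by the standard deviation). The sketch below assumes that approach and adds a hypothetical weighting step for attributes that should count more or less; the attribute values are invented.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # A constant attribute carries no information for clustering.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def apply_weight(z_scores, weight):
    """Optionally let an attribute contribute more (>1) or less (<1) to the clusters."""
    return [weight * z for z in z_scores]

# Hypothetical attributes on very different scales.
annual_spend = [200.0, 1500.0, 800.0, 3200.0]
store_visits = [2.0, 12.0, 6.0, 20.0]

spend_z = standardize(annual_spend)
visits_z = apply_weight(standardize(store_visits), weight=2.0)
print([round(z, 2) for z in spend_z])
print([round(z, 2) for z in visits_z])
```

Without this step, a distance-based clustering technique would be driven almost entirely by annual spend, simply because its numbers are larger.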

Two categories of clustering techniques are available - hierarchical and non-hierarchical. Non-hierarchical techniques place customers into a predetermined number of segments. Hierarchical techniques build a nested set of groups, either by starting with the full customer base and developing splits based on the attributes or by starting with individual customers and successively merging the most similar groups. The various clustering techniques include K-means, nearest neighbor (single linkage), farthest neighbor (complete linkage) and average distance (average linkage). K-means is a simple non-hierarchical method that is often used.
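A bare-bones version of K-means is sketched below: assign each customer to the nearest centroid, recompute the centroids, and repeat until nothing changes. The starting centroids are picked by a naive, deterministic rule chosen for this illustration (evenly spaced rows), and the two standardized attributes are invented; real packages use more careful initialization.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))

def kmeans(points, k, iterations=20):
    # Assumed initialization: take k evenly spaced rows as starting centroids.
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centroids.append(mean_point(members) if members else centroids[j])
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids

# Hypothetical customers described by two standardized behavioral attributes.
customers = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels, centroids = kmeans(customers, k=2)
print(labels)
```

The number of segments (k) is fixed in advance, which is what makes K-means a non-hierarchical technique.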

Clustering techniques are powerful methods. If used properly, they will increase overall marketing effectiveness. It is best to develop the clustering scheme(s) prior to building the predictive models. This allows for an up-and-running descriptive targeting tool that may be used to enhance the predictive modeling efforts. Knowing in which cluster(s) a customer resides places valuable information into the modeling tool because the clusters contain individuals with similar behaviors.

The Application

Similar to data transformation, model application and scoring is best completed using a tool outside the database environment. This process includes extracting data, transforming the attributes, scoring the models, updating the clusters and storing the results back in the database. Placing the scores and clusters in the database allows marketing, customer service and risk to use this information for all customer relationship management (CRM) decisions. In accordance with the time-series fashion of the data warehouse, the scores and segments should be retained over time. This allows for future analyses of how customers migrate from cluster to cluster or how their model scores increase or decrease over time. It is this "change in behavior" that may be used to further enhance future models.
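Because the scores and segments are retained over time, cluster migration can be summarized by counting how many customers moved from each cluster to each other cluster between two scoring dates. The customer IDs and cluster labels below are hypothetical.

```python
from collections import Counter

def migration_counts(previous, current):
    """Count (old cluster, new cluster) pairs for customers present in both snapshots."""
    return Counter(
        (previous[cust], current[cust]) for cust in previous if cust in current
    )

# Hypothetical cluster assignments stored at two scoring dates.
last_quarter = {"c1": "A", "c2": "A", "c3": "B", "c4": "B"}
this_quarter = {"c1": "A", "c2": "B", "c3": "B", "c4": "A"}

moves = migration_counts(last_quarter, this_quarter)
for (old, new), count in sorted(moves.items()):
    print(f"{old} -> {new}: {count}")
```

Pairs where the old and new cluster differ are the "change in behavior" that can be fed back into future models.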

Predictive modeling and segmentation application issues tend to center on data timing. The data warehouse loading interval must be considered during the model planning and application stages. The age of the data when it becomes available, the time required to apply the model and the time to complete the campaign process must all be determined prior to selecting the model. Verification samples must also be considered during the application process. Data timing also refers to how recent the data must be to deliver the required power. The most advantageous scenario would be working with up-to-the-minute customer interactions at the time of the proposed marketing or risk intervention. In most instances, however, that requires more resources than it is worth. Additional analyses should be completed to compare the power of recent data to the resources required. Data recency is defined as the amount of time since the last activity with the customer. The data recency analysis should compare the results of marketing and risk applications on data of various ages (e.g., one day old, two days old, one week old).
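A data recency analysis along these lines might be sketched as follows. The recency calculation follows the definition in the text; the campaign results by data age are invented numbers used only to show the shape of the comparison.

```python
from datetime import date

def recency_days(last_activity, as_of):
    """Days elapsed between the customer's last activity and the scoring date."""
    return (as_of - last_activity).days

def response_rates(results):
    """Response rate for each data age, given (responders, contacted) counts."""
    return {age: responders / contacted for age, (responders, contacted) in results.items()}

# Hypothetical results keyed by the age of the data (in days) when the model was applied.
results_by_data_age = {1: (52, 1000), 2: (50, 1000), 7: (41, 1000)}

days = recency_days(last_activity=date(2004, 5, 25), as_of=date(2004, 6, 1))
rates = response_rates(results_by_data_age)
print(f"recency: {days} days")
for age, rate in sorted(rates.items()):
    print(f"data {age} day(s) old: response rate {rate:.1%}")
```

If the drop in results between fresher and older data is small, the extra resources needed for more frequent loads may not be justified.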

Customer analytic applications enable enhanced customer relations resulting in increased efficiency and customer profitability. The data warehouse and analytic applications deliver a suite of capabilities that drive revenue into the business units. Although it is the data and the analytical techniques that provide the segmentation, the business process should drive the data-driven decisions because spending valuable time to create a powerful model may not add efficiencies. Business processes can only be made more efficient when data about the process is captured and stored. Data, especially customer behavioral data, allows the analytical techniques to effectively segment and predict. It is these segmentations and predictions, known as customer analytics, that drive long-term customer profits and loyalty.
