Radha would like to thank Ajeshkumar Vijaydas for contributing this column.


In the insurance industry, it is common to engage customers in repeat business by offering value-added services in the form of high-class customer service and promotions. Typically in the health insurance industry, companies can identify multiple cross-selling opportunities by crunching data in the underlying customer and transactional databases. These opportunities can be identified through predictive modeling and data mining techniques. High-end mathematical and statistical theories help analysts develop appropriate models for usage in predictive analytics.


In this column, I introduce a chain of data mining and predictive modeling techniques to identify the probable shopping patterns of customers buying related insurance products.


Initially, customers are segmented based upon their age, region, income, length of relationship and product coverage. The hypothetical variables are then derived from transactional data, past illness data and demographic data.


Consider individual and family health plans across different products. Start by creating segments from the identified variables, linking the total number of active insurance coverages to independent variables drawn from the transactional data. Some examples of variables are:


  • Insured ID;
  • Plan name (product type);
  • Plan type;
  • Length of relationship;
  • Total insured amount;
  • Premium amount;
  • Socioeconomic status data such as age, gender, region, ZIP code, employment status and years of experience;
  • Employed in public/private firm;
  • Income class; and
  • Past illness.
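
As a minimal sketch of how the variables above could be held in one record and turned into a segment label, the Python below uses a hypothetical `InsuredRecord` type. The field names and the age and tenure cut points are assumptions made for illustration; they are not values from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InsuredRecord:
    """Hypothetical customer record built from the variables listed above."""
    insured_id: str
    plan_name: str
    plan_type: str
    years_with_insurer: float
    total_insured_amount: float
    premium_amount: float
    age: int
    region: str
    income_class: str
    past_illness: bool


def age_band(age: int) -> str:
    # Assumed cut points, chosen only for illustration.
    if age < 30:
        return "under_30"
    if age < 50:
        return "30_to_49"
    return "50_plus"


def tenure_band(years: float) -> str:
    # Assumed cut points for length of relationship.
    if years < 2:
        return "new"
    if years < 5:
        return "established"
    return "long_term"


def segment_key(rec: InsuredRecord) -> tuple:
    """Combine age, region, income and tenure into one segment label."""
    return (age_band(rec.age), rec.region, rec.income_class,
            tenure_band(rec.years_with_insurer))
```

Records that share a `segment_key` fall into the same segment, so grouping on this key gives a first, rule-based segmentation before any model is fitted.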

Three sets of plans are selling widely across a particular region and, from the total plan offerings, 15 popular plan types are used for modeling. SAS Enterprise Miner is used for modeling under the SEMMA (Sample, Explore, Modify, Model, Assess) framework. Figure 1 explains the methodological framework for modeling in SAS Enterprise Miner.





The objective is to target three business opportunities:


  1. Best segments and underlying business rules;
  2. Customer scores for multiple cross-sell opportunities; and
  3. Product purchase sequence.

Exploratory Data Analysis


This stage includes exploratory data analysis (EDA), such as data understanding, missing value check, outlier verification, cross-tabulation, data subset creation, visualization and statistical hypothesis creation for the final modeling.




There are multiple steps in the modeling process. The first is variable selection. The second is building the decision tree and several other predictive models, comparing those models with the decision tree, and scoring each customer.


Based on the EDA and the variable types, the variables are shortlisted for segmentation using the options of the variable selection node. The R² criterion evaluates variables by quality of fit. It uses a stepwise method of selecting variables that stops when the improvement in the R² value is less than 0.0005. By default, the method rejects variables whose contribution is less than 0.005.
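
The two thresholds can be illustrated with a small forward-selection sketch in plain Python. This is not the SAS implementation: it assumes numeric inputs and a numeric target, screens each variable on its single-variable R², and then adds variables one at a time until the gain in R² falls below the stop value. The function and parameter names (`forward_select_r2`, `min_r2`, `stop_r2`) are hypothetical.

```python
def center(values):
    """Subtract the mean so that dot products behave like covariances."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def remove_component(vector, basis):
    """Remove from `vector` the part that `basis` explains linearly."""
    coef = dot(basis, vector) / dot(basis, basis)
    return [v - coef * b for v, b in zip(vector, basis)]


def forward_select_r2(columns, target, min_r2=0.005, stop_r2=0.0005):
    """Return [(name, r2_gain), ...] in the order variables are added.

    columns: dict mapping variable name -> list of numbers
    target:  list of numbers, same length as each column
    """
    y_res = center(target)
    total_ss = dot(y_res, y_res)
    if total_ss == 0:
        return []  # a constant target cannot be explained

    # Screening: reject variables whose own R^2 is below min_r2.
    candidates = {}
    for name, values in columns.items():
        x = center(values)
        xx = dot(x, x)
        if xx > 0 and dot(x, y_res) ** 2 / (xx * total_ss) >= min_r2:
            candidates[name] = x

    selected = []
    while candidates:
        # Gain in R^2 from adding each remaining candidate to the model.
        gains = {}
        for name, x in candidates.items():
            xx = dot(x, x)
            gains[name] = (dot(x, y_res) ** 2 / (xx * total_ss)
                           if xx > 1e-12 else 0.0)
        best = max(gains, key=gains.get)
        if gains[best] < stop_r2:
            break  # improvement too small: stop adding variables
        x_best = candidates.pop(best)
        selected.append((best, gains[best]))
        # Keep the residual target and the remaining candidates
        # orthogonal to everything already selected.
        y_res = remove_component(y_res, x_best)
        candidates = {name: remove_component(x, x_best)
                      for name, x in candidates.items()}
    return selected
```

Because each candidate is re-expressed net of the variables already chosen, a variable that merely duplicates a selected one contributes no further gain and is left out.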


In the decision tree, we used the Chi-Square Automatic Interaction Detector (CHAID) method for data classification. In CHAID, attribute splitting is based on the chi-square test. The chi-square statistic is $\chi^2 = \sum (f_i - f_e)^2 / f_e$, where $f_i$ is the observed frequency and $f_e$ is the expected frequency in each cell. The recursive algorithm is a greedy heuristic search for a simple tree, so it cannot guarantee optimality. The main decision in the algorithm is the selection of the next attribute to condition on.
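
As a small worked illustration of the chi-square statistic, the sketch below computes it for a contingency table of counts, with rows as the categories of a candidate attribute and columns as the target classes. The function name `chi_square` and the example tables in the note are made up for illustration.

```python
def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts.

    table: list of rows, each row a list of counts (one per target class).
    Expected counts assume the attribute and the target are independent.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected
    return statistic
```

For the table `[[10, 20], [20, 10]]` every expected count is 15, so the statistic is 4 × 25/15, or about 6.67. A table such as `[[10, 10], [20, 20]]`, where the class mix is the same in both rows, gives 0.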


The key is to use attributes that split the examples into sets that are relatively pure in one label; this brings us closer to a leaf node. The most popular heuristic is based on information gain.


Attribute categories are split and merged according to the chi-square test, and at each step the level of significance is adjusted with a Bonferroni correction because multiple tests are being performed. The size of the adjustment depends on the number of tests.
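
The generic form of the Bonferroni correction can be sketched as follows. This shows the textbook adjustment (multiply each p-value by the number of tests, capped at 1); the multiplier that CHAID software actually applies is derived from the number of possible category groupings, so treat this as an illustration of the idea only. The function names are hypothetical.

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of tests, capping at 1.0."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


def significant_after_adjustment(p_values, alpha=0.05):
    """Indices of the tests that stay significant after the adjustment."""
    adjusted = bonferroni_adjust(p_values)
    return [i for i, p in enumerate(adjusted) if p < alpha]
```

With three candidate splits and raw p-values of 0.01, 0.04 and 0.5, only the first remains below 0.05 after adjustment; the second rises to 0.12 and would no longer justify a split.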


Among the least squares, Gini reduction and entropy reduction criteria, Gini with the model assessment measure of “total leaf impurity” produced the best model. The Gini index is interpreted as the probability that any two elements of a multiset chosen at random with replacement are different. A pure node has a Gini index of 0. As the number of evenly distributed classes increases, the Gini index approaches 1. The Gini index formula is illustrated in Figure 2.
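
The behavior of the Gini index described here can be checked with a short sketch. It uses the standard definition, one minus the sum of squared class proportions; the function name `gini_index` is chosen for illustration.

```python
from collections import Counter


def gini_index(labels):
    """Gini impurity: chance that two draws with replacement differ."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())
```

A node containing a single class scores 0, two evenly mixed classes score 0.5, and ten evenly mixed classes score 0.9, which shows the index approaching 1 as the number of evenly distributed classes grows.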


Entropy is another measure of the variability of categorical data. The entropy (impurity, disorder) of a set of examples S, relative to a binary classification, is $\mathrm{Entropy}(S) = -p_{+} \log_2 p_{+} - p_{-} \log_2 p_{-}$, where $p_{+}$ is the proportion of positive examples in S and $p_{-}$ is the proportion of negative examples. If all the examples belong to the same category, then Entropy = 0 (taking $0 \log_2 0 = 0$). If the examples are equally mixed (0.5, 0.5), then Entropy = 1. In general, when $p_i$ is the fraction of examples labeled i, entropy is $-\sum_i p_i \log_2 p_i$, as defined in Figure 2.



Entropy can be viewed as the number of bits required, on average, to encode the class label of an example. If the probability for + is 0.5, a single bit is required for each example.
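
A brief sketch ties entropy to the information gain heuristic mentioned earlier: the gain of a split is the parent's entropy minus the size-weighted entropy of the child nodes. The function names and the "+" and "-" labels are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy in bits of the class labels in one node."""
    n = len(labels)
    if n == 0:
        return 0.0
    proportions = [count / n for count in Counter(labels).values()]
    return sum(-p * math.log2(p) for p in proportions)


def information_gain(parent, children):
    """Entropy of the parent minus the weighted entropy of its children.

    parent:   list of labels before the split
    children: list of label lists, one per branch of the split
    """
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted
```

For a parent of two "+" and two "-" examples the entropy is 1 bit. A split that separates the classes perfectly recovers the full bit of gain, while a split that leaves each branch evenly mixed gains nothing.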


Two decision tree scenarios were tried, each with a different target variable. In the first, the target is the total number of products owned, and CHAID is run against it. In the second, CHAID and a neural network are run against a binary target indicating multiple-product ownership (1 = Yes, 0 = No). Based on the model comparison node, the first scenario was selected and evaluated with the assessment nodes. The decision tree showed the highest lift values within the top 40 percentiles.


The final node diagram for the modeling is shown in Figure 3.



For ease of comparison, the model comparison plot is given in Figure 4.



Figure 5 gives the node definition with lift values for a few terminal nodes. This helps to identify the profitable customer segments.
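
To make the lift values concrete, the sketch below computes lift per terminal node as the node's response rate divided by the overall response rate. It assumes a binary response flag per customer (for example, holding multiple products); the node names and counts in the note are invented for illustration and are not taken from Figure 5.

```python
def node_lift(nodes):
    """Lift per node: node response rate / overall response rate.

    nodes: dict mapping node id -> (n_customers, n_responders)
    """
    total_customers = sum(n for n, _ in nodes.values())
    total_responders = sum(r for _, r in nodes.values())
    overall_rate = total_responders / total_customers
    return {
        node_id: (responders / customers) / overall_rate
        for node_id, (customers, responders) in nodes.items()
    }


def nodes_above_average(nodes):
    """Node ids whose lift is greater than 1, best first."""
    lifts = node_lift(nodes)
    chosen = [node_id for node_id, lift in lifts.items() if lift > 1.0]
    return sorted(chosen, key=lambda node_id: lifts[node_id], reverse=True)
```

With nodes A (100 customers, 40 responders), B (100, 10) and C (200, 30), the overall rate is 20 percent, so A has a lift of 2.0 while B and C fall below 1. Only A would be kept as a target segment.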



The lift values of the top four nodes are greater than 1, and the customers in these four nodes were selected to find the sequence of the next products. (The four groups are identified as the best segments for cross-sell.) For the product sales sequence analysis, SAS Enterprise Miner’s association node was applied. This is one of the objectives of this modeling. The product purchase date is needed to run the association node. The next table lists the top 10 product sequences. Product Type 1 → Product Type 3 showed the highest frequency of product purchasing sequence, followed by Product Type 3 → Product Type 1. Figure 6 gives the sequence of product purchase and Figure 7 gives the predicted status and the likelihood scores.
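
The role of the purchase date can be seen in a simplified sketch that orders each customer's purchases by date and counts consecutive product pairs. This is only a frequency count under assumed input, not the association node's algorithm, which also reports measures such as support and confidence. The function name and the tuple layout are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import date


def count_sequences(purchases):
    """Count ordered product pairs bought one after the other.

    purchases: iterable of (customer_id, product_type, purchase_date)
    Returns a Counter keyed by (earlier_product, later_product).
    """
    by_customer = defaultdict(list)
    for customer_id, product, purchased_on in purchases:
        by_customer[customer_id].append((purchased_on, product))

    counts = Counter()
    for events in by_customer.values():
        events.sort()  # chronological order within each customer
        for (_, earlier), (_, later) in zip(events, events[1:]):
            if earlier != later:
                counts[(earlier, later)] += 1
    return counts
```

Calling `count_sequences(...).most_common(10)` on the purchase history of the selected segments would give a top 10 list of the same shape as the one described above.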




Figure 7 shows the identified customer segments with the types of products held. A predicted likelihood score for holding multiple products is also assigned to each customer. To identify the product cross-sell opportunities, the scores are calculated as follows. The objective is to predict the customers who will buy Product Type 1 among current Product Type 3 customers and to predict the customers who will buy Product Type 3 among current Product Type 1 customers. Because so many models are required, only the top four nodes were chosen. Each was then scored for individual customers. Next, the cross-sell scores were computed. Figure 8 is an example of the scores.



Other Applications in Insurance


In health insurance, a common application is designing the plans that best fit each customer's choices. In part two of this column next month, I will explain the choice model for more suitable product combinations for cross-selling. Apart from health insurance, successful scoring models can be developed for pension schemes, motor vehicle insurance, dental insurance, travel insurance, life insurance and property insurance. Customers in selected profitable segments can be targeted with specific products to increase profitable sales.


This column describes customer-level cross-sell models to increase sales and acquire more profitable customers. Each customer holds either a single membership or multiple memberships, depending upon the number of products and product types held. If a customer holds more than “n” memberships, the customer may belong to an employer level, and an employer-level model can be built for more detailed profiling of customers. This also helps the marketing and sales team to plan campaigns for cross-sells, direct mailing, etc.
