According to Aberdeen Group, by 2003 CRM spending will exceed $24 billion. Most of this money will be spent on hardware and software, which are major focuses of both CRM and database marketing. Much of this money will be wasted, however, because not enough attention is paid to the most critical ingredient in a CRM program: the data.

Major corporate initiatives develop from events as minimal as an executive hearing a single comment in one focus group or as massive as a data mining exercise on a terabyte-size CRM database. Ultimately, the soundness of these initiatives depends on the quality of the data.

Database marketers and CRM managers are very familiar with "data cleanliness" issues. Merging and purging customer lists are standard operations in the business. Unfortunately, these operations are too often where consideration of the data's appropriateness ends.

Sound database marketing and CRM decision making depends on relevant data. Massive transactional files and appended demographic and behavioral data are key to good decision making only if the variables they contain are relevant.

What is meant by "relevant"? Data is relevant if it meets two criteria: It answers the questions management needs answered in order to meet the enterprise's objectives, and it is actionable.

Management requirements depend on many factors. For example, in the case of telecom and energy utilities, the stage of deregulation will set the parameters. Prior to deregulation, companies in these industries must determine what factors will produce loyalty among their current customers. After deregulation, they must implement up-sell, cross-sell and win-back strategies. Finally, they must develop anti-churn strategies.

Combating churn in subscription-based businesses is a critical problem. The typical Internet service provider, for example, loses four to eight percent of its subscribers monthly, or up to 96 percent annually. Telecom companies and deregulated energy utilities face enormous losses when customers switch. Much of the model building to predict switching customers focuses on usage data, demographic information and other variables likely to be stored in the enterprise's data warehouse. Even if the quantity of the data is massive, is it relevant?
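The 96 percent figure is the eight percent monthly rate multiplied by twelve, which measures a year's losses against a subscriber base that is continually replenished. A minimal sketch of that arithmetic follows, alongside a compounded version showing what share of one original group of subscribers would be gone after a year with no replacement (about 63 percent at eight percent monthly). The function names are illustrative and are not drawn from any particular CRM package.

```python
def annualized_churn_simple(monthly_rate: float) -> float:
    """Monthly churn rate times 12: a year's losses relative to a replenished base."""
    return monthly_rate * 12


def annual_cohort_loss(monthly_rate: float, months: int = 12) -> float:
    """Share of one original cohort that has left after `months`, with no replacement."""
    return 1 - (1 - monthly_rate) ** months


if __name__ == "__main__":
    for rate in (0.04, 0.08):
        print(
            f"{rate:.0%} monthly: "
            f"simple annualized {annualized_churn_simple(rate):.0%}, "
            f"cohort loss {annual_cohort_loss(rate):.0%}"
        )
```

Either way of counting leads to the same conclusion: most of the subscriber base turns over within a year or two.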

It's very dangerous to assume that the relevant data is already part of the database. Many organizations simply correlate the data available in their databases with switchers versus nonswitchers to produce churn models. If the assumptions about data relevancy are incorrect, the data mining exercise will be a waste of time.
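The practice described above can be sketched in a few lines. The example below ranks whatever variables happen to be in a customer file by the strength of their correlation with a switched/stayed flag. The field names and records are hypothetical. Note what the sketch cannot do: a variable that was never collected can never appear in the ranking, however predictive it might be.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series has no variation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_variables(records, target="switched"):
    """Rank every non-target field by absolute correlation with the target flag."""
    ys = [r[target] for r in records]
    names = [k for k in records[0] if k != target]
    scores = {name: pearson([r[name] for r in records], ys) for name in names}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical customer file: 1 = switched, 0 = stayed.
customers = [
    {"monthly_usage": 120, "complaints": 0, "switched": 0},
    {"monthly_usage": 80, "complaints": 3, "switched": 1},
    {"monthly_usage": 100, "complaints": 0, "switched": 0},
    {"monthly_usage": 90, "complaints": 2, "switched": 1},
    {"monthly_usage": 110, "complaints": 1, "switched": 0},
    {"monthly_usage": 95, "complaints": 4, "switched": 1},
]

ranked = rank_variables(customers)
```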

The safest way to proceed is to begin discussions with management to ascertain the exact business criteria. It is most important to evaluate what management wants to do in the market. Segmentation will only work if the results mirror how real customers behave in the real marketplace. After determining the business criteria, analysts should turn their attention to data criteria.

For example, in the case of a subscription business with a churn problem, how much time is needed to act before the customer leaves? If remedial action takes two weeks, is relevant data available that can predict a switch at least two weeks in advance? What actions must take place within those two weeks? What characteristics are known to be predictive?
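One way to test the lead-time question against historical records is to check which warning signals were actually on file early enough to act on. The sketch below assumes a two-week response time, as in the example above. The signal names and dates are hypothetical.

```python
from datetime import date, timedelta


def usable_signals(signal_dates, churn_date, lead_time=timedelta(days=14)):
    """Keep only signals observed at least `lead_time` before the customer left."""
    cutoff = churn_date - lead_time
    return {name: seen for name, seen in signal_dates.items() if seen <= cutoff}


# Hypothetical history for one customer who left on March 28.
signals = {
    "complaint_logged": date(2002, 3, 1),
    "usage_drop": date(2002, 3, 20),
    "competitor_offer_in_area": date(2002, 2, 15),
}
churned_on = date(2002, 3, 28)

early = usable_signals(signals, churned_on)
```

In this example the usage drop arrives too late to be useful, even though it may correlate well with switching. A variable can be predictive and still fail the business criterion.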

If there are no satisfactory answers to this last set of questions, market research may be necessary. The various reasons for switching need to be isolated through focus groups, survey research or a combination of the two. Reasons for switching might include perceptions of unsatisfactory service or better deals from competitors. In the former case, declining customer satisfaction ratings or rising complaint rates can indicate the problem. In the latter case, competitive activity in a customer's area can be included in the customer file.

Using survey research data to determine critical variables is a key strategy I have employed. In the case of segments based on churn, for example, one can choose three samples: households who have remained with the company, others who left and then returned, and those who have left and never returned. Through a well-structured survey, one can determine the demographics, behaviors and triggers that prompt switching. At this point, modeling is performed only on the survey data. Once switching behavior is understood and the critical variables determined, the CRM mart or database marketing file is adjusted to implement the new models.
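Drawing the three samples can be sketched as follows. The example assumes each household's history is stored as a chronological list of "join" and "leave" events, which is a hypothetical layout, and it treats any household that left at least once and is currently active as having returned.

```python
import random
from collections import defaultdict


def churn_segment(events):
    """Classify a household from its chronological 'join'/'leave' events."""
    if "leave" not in events:
        return "stayed"
    # Left at least once: the most recent event shows whether they came back.
    return "left_and_returned" if events[-1] == "join" else "left_never_returned"


def draw_samples(histories, per_segment, seed=0):
    """Build a sampling frame per segment and draw up to `per_segment` households from each."""
    frame = defaultdict(list)
    for household_id, events in histories.items():
        frame[churn_segment(events)].append(household_id)
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return {
        segment: rng.sample(ids, min(per_segment, len(ids)))
        for segment, ids in frame.items()
    }


# Hypothetical household histories.
histories = {
    "h1": ["join"],
    "h2": ["join", "leave", "join"],
    "h3": ["join", "leave"],
    "h4": ["join"],
}

samples = draw_samples(histories, per_segment=1)
```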

Often, critical variables are not part of a database because they are rare occurrences, such as complaints, or are ad hoc issues such as attitudes about products. More often than not, these unpopulated variables are the keys to the predictions management needs in order to act. The way to handle these variables is to determine their relationship to typically collected data, such as demographics and behaviors, which are then used as surrogates in the CRM data mart.
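A minimal sketch of the surrogate approach: measure the rare variable in the survey, summarize it by combinations of commonly held fields, then carry that summary over to every customer in the data mart who falls into the same combination. The field names below (`tenure`, `region`, `complained`) are assumptions made for illustration only.

```python
from collections import defaultdict


def surrogate_rates(survey_rows, keys, flag):
    """Share of survey respondents with `flag` set, per combination of `keys`."""
    totals, hits = defaultdict(int), defaultdict(int)
    for row in survey_rows:
        cell = tuple(row[k] for k in keys)
        totals[cell] += 1
        hits[cell] += 1 if row[flag] else 0
    return {cell: hits[cell] / totals[cell] for cell in totals}


def score_customers(customers, rates, keys, default=None):
    """Attach the surveyed rate to each customer with a matching key combination."""
    return [
        {**c, "surrogate_score": rates.get(tuple(c[k] for k in keys), default)}
        for c in customers
    ]


# Hypothetical survey results and customer records.
survey = [
    {"tenure": "<1yr", "region": "urban", "complained": True},
    {"tenure": "<1yr", "region": "urban", "complained": False},
    {"tenure": "5yr+", "region": "rural", "complained": False},
]
database = [
    {"id": 101, "tenure": "<1yr", "region": "urban"},
    {"id": 102, "tenure": "5yr+", "region": "rural"},
    {"id": 103, "tenure": "1-5yr", "region": "urban"},
]

keys = ("tenure", "region")
scored = score_customers(database, surrogate_rates(survey, keys, "complained"), keys)
```

Customers whose combination was never surveyed receive no score here, which is a reminder that the survey samples must cover the segments the model will be applied to.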

Figure 1 shows how third-party demographic and behavior data can be merged into a utility's database and how segments based on other data can be mapped analytically onto the same database. Using a variety of data, including data collected directly from customers, acquired through third-party providers and inferred through analytic processes, will result in the most powerfully predictive data marts.

Figure 1: Merging Data and Mapping Segments for Predictive Analysis
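The merge-and-map step can be illustrated with a small sketch. It assumes both sources share a `household_id` key and that segments are expressed as simple rules applied in order. The fields, thresholds and segment names are hypothetical and are not taken from Figure 1.

```python
def merge_sources(customers, third_party, segment_rules):
    """Left-join third-party fields onto customer rows, then assign the first matching segment."""
    merged = []
    for customer in customers:
        # Customers with no third-party match keep only their own fields.
        row = {**customer, **third_party.get(customer["household_id"], {})}
        row["segment"] = next(
            (name for name, rule in segment_rules if rule(row)), "unassigned"
        )
        merged.append(row)
    return merged


# Hypothetical utility records, appended data and segment rules.
customers = [
    {"household_id": 1, "monthly_kwh": 900},
    {"household_id": 2, "monthly_kwh": 300},
    {"household_id": 3, "monthly_kwh": 600},
]
third_party = {1: {"income_band": "high", "home_owner": True}}
segment_rules = [
    ("high_use_owner", lambda r: r.get("home_owner") and r["monthly_kwh"] > 800),
    ("low_use", lambda r: r["monthly_kwh"] < 400),
]

merged = merge_sources(customers, third_party, segment_rules)
```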

As important as it is to have the most up-to-date hardware and software technology, money will be wasted if the data is irrelevant and the strategies for using that data are unsound. The cliché "garbage in, garbage out" has never been more true.
