Technical Measures for Data Quality Investments

  • May 20 2008, 5:21pm EDT
This column looks at technical measures of data quality for the cases presented in my May column.

Profit per customer case. An automobile dealer made service history available to salespeople while a new car purchase was being negotiated. The business value came from targeted offers to increase use of the highly profitable service department. Technical data quality measures include:

Speed of access. This is the time it takes the salesperson to retrieve data for a customer. Multiple queries may be needed before the system returns a satisfactory result, and salespeople will not bother if it takes too much effort. Elapsed time would be gathered from system logs.
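
As a rough illustration, elapsed time could be derived from log entries along the following lines. This is only a sketch: the log layout (session id, timestamp in seconds, action) and the action names "search" and "view" are assumptions, not the format of any particular system.

```python
def time_to_result(events):
    """Return seconds from the first search to the first view, per session.

    `events` is an iterable of (session_id, timestamp_seconds, action)
    tuples in chronological order. The tuple layout and the action names
    "search" and "view" are hypothetical.
    """
    first_search = {}
    elapsed = {}
    for session_id, timestamp, action in events:
        if action == "search":
            # Remember only the first search of each session.
            first_search.setdefault(session_id, timestamp)
        elif action == "view" and session_id in first_search:
            # Record only the first view that follows a search.
            if session_id not in elapsed:
                elapsed[session_id] = timestamp - first_search[session_id]
    return elapsed


if __name__ == "__main__":
    log = [
        ("a", 0.0, "search"),
        ("a", 4.5, "search"),
        ("a", 9.0, "view"),
        ("b", 2.0, "search"),  # abandoned: no view follows
    ]
    print(time_to_result(log))
```

Sessions that never reach a view are left out of the result, so abandoned searches would need to be counted separately.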

Match rate. This is the proportion of successful matches returned by the system. There are separate statistics for correct matches, false matches and missed matches. Match accuracy is often difficult to measure because the correct answer is not known. But in this case, the customer will know whether she has previously used the service department. The salesperson should therefore know when to keep trying until the system returns a match. This means the most important measure is “correct results returned on the first try,” as shown by the number of successful single-search sessions. Successful searches are followed by a request to view the underlying data. Abandoned searches are not.
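
The "correct results returned on the first try" measure might be computed roughly as follows. The event format (session id plus a "search" or "view" action) is an assumption made for illustration; a real system log would need to be mapped onto it.

```python
from collections import defaultdict


def first_try_success_rate(events):
    """Share of search sessions that succeeded with a single search.

    `events` is an iterable of (session_id, action) pairs in chronological
    order, where action is "search" or "view" (hypothetical names).
    A session counts as a first-try success when exactly one search
    precedes its first request to view the underlying data.
    """
    searches_before_view = defaultdict(int)
    viewed = set()
    for session_id, action in events:
        if action == "search" and session_id not in viewed:
            searches_before_view[session_id] += 1
        elif action == "view":
            viewed.add(session_id)

    total_sessions = len(searches_before_view)
    if total_sessions == 0:
        return 0.0
    first_try = sum(
        1 for s in viewed if searches_before_view.get(s, 0) == 1
    )
    return first_try / total_sessions


if __name__ == "__main__":
    log = [
        ("a", "search"), ("a", "view"),                   # first-try success
        ("b", "search"), ("b", "search"), ("b", "view"),  # success on retry
        ("c", "search"),                                  # abandoned
    ]
    print(first_try_success_rate(log))
```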

Service data quality. This includes all quality components - accuracy, completeness, consistency, currency and suitability to task. Because the service history is derived from the service department’s billing system, it should be reasonably accurate, current and complete. This would be confirmed by the company’s normal auditing functions. Consistency is measured by profiling the data over time to identify unexpected values or value distributions. Profiling can also detect improper or fraudulent billing - something the service manager may or may not be particularly eager to explore.
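
One simple way to profile for consistency is to compare the value distribution of a field in the current period against a baseline period. The sketch below assumes a single categorical field (for example, a hypothetical service type code) and an arbitrary 10-point threshold. It is one possible approach, not a description of any specific profiling tool.

```python
from collections import Counter


def profile_shift(baseline, current, threshold=0.10):
    """Compare two samples of a categorical field.

    Returns (unexpected, shifted):
      unexpected - values seen in `current` but never in `baseline`
      shifted    - baseline values whose share moved by more than
                   `threshold`, mapped to (baseline_share, current_share)
    The threshold of 0.10 is an illustrative assumption.
    """
    if not baseline or not current:
        return [], {}
    base_counts = Counter(baseline)
    cur_counts = Counter(current)
    unexpected = sorted(set(cur_counts) - set(base_counts))
    shifted = {}
    for value, count in base_counts.items():
        base_share = count / len(baseline)
        cur_share = cur_counts[value] / len(current)
        if abs(cur_share - base_share) > threshold:
            shifted[value] = (base_share, cur_share)
    return unexpected, shifted


if __name__ == "__main__":
    last_quarter = ["oil"] * 6 + ["brake"] * 4
    this_quarter = ["oil"] * 3 + ["brake"] * 4 + ["warranty-x"] * 3
    print(profile_shift(last_quarter, this_quarter))
```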

Suitability to task is a particular challenge because the data is being used for something other than its original purpose. The system must summarize the raw service data to show aggregate purchases, changes in usage patterns, types of work (e.g., all routine maintenance or only major repairs) and inferences about customer needs (high mileage, off-road travel, heavy loads, etc.). Summarization depends on the core data quality measures of accuracy, completeness and consistency.

Even summarized data can be difficult for a salesperson to interpret, so the system should also recommend a best offer. Recommendation quality is measured by tracking how many recommendations are presented by the salespeople, how many of these are accepted by the customers, and their long-term impact on customer profitability. Presentations and acceptances can be measured directly so long as salespeople record their results. Long-term impact requires tracking customers over time.
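
The two directly measurable rates can be sketched as below, assuming each recommendation is logged as a record with hypothetical "presented" and "accepted" flags entered by the salesperson. Long-term profitability impact is not covered, since it requires tracking customers over time.

```python
def recommendation_rates(records):
    """Compute presentation and acceptance rates for recommendations.

    `records` is a list of dicts with boolean keys "presented" and
    "accepted" (hypothetical field names). Acceptance is measured only
    against recommendations that were actually presented.
    """
    total = len(records)
    presented = sum(1 for r in records if r["presented"])
    accepted = sum(1 for r in records if r["presented"] and r["accepted"])
    return {
        "presentation_rate": presented / total if total else 0.0,
        "acceptance_rate": accepted / presented if presented else 0.0,
    }


if __name__ == "__main__":
    sample = [
        {"presented": True, "accepted": True},
        {"presented": True, "accepted": False},
        {"presented": True, "accepted": False},
        {"presented": False, "accepted": False},
    ]
    print(recommendation_rates(sample))
```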

Similar technical data quality measures apply to the other three cases discussed last month. Briefly:

Promotion effectiveness case. This was a project to improve accuracy of a packaged goods manufacturer’s lists of distributor contacts. The business value was better execution of retail promotions. Technical data quality measures include:

List accuracy. Determined by random telephone calls to the distributors to verify the names on the existing lists. Returned mail and rejected email addresses may also provide information.

Update speed. Determined by tracking how often the sales force provides list updates. This will identify salespeople who are not participating.

Value per response case. This described an online marketer’s project to reduce bad debt and improve product recommendations through better real-time access to customer history. Technical data quality measures include:

Match rates with internal systems. Measures include the percentage of successful matches, the percentage of confident matches (using a system-generated confidence score) and the percentage of multiple matches (more than one customer record matches a single input). Here, independent validation of match accuracy may not be available.
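
These three percentages could be tallied along the following lines. The sketch assumes each lookup returns a list of (customer id, confidence score) candidates and that a score of 0.9 or higher counts as confident. Both the data shape and the cutoff are illustrative assumptions.

```python
def match_rates(results, confident_at=0.9):
    """Summarize match outcomes for a batch of lookups.

    `results` holds one list per lookup, containing (customer_id, score)
    candidates; an empty list means no match. The 0.9 cutoff for a
    "confident" match is an assumed value.
    """
    total = len(results)
    if total == 0:
        return {"matched": 0.0, "confident": 0.0, "multiple": 0.0}
    matched = sum(1 for candidates in results if candidates)
    confident = sum(
        1
        for candidates in results
        if candidates and max(score for _, score in candidates) >= confident_at
    )
    multiple = sum(1 for candidates in results if len(candidates) > 1)
    return {
        "matched": matched / total,
        "confident": confident / total,
        "multiple": multiple / total,
    }


if __name__ == "__main__":
    batch = [
        [("c1", 0.95)],
        [],
        [("c2", 0.60), ("c3", 0.55)],
        [("c4", 0.92), ("c5", 0.40)],
    ]
    print(match_rates(batch))
```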

Match rates from external sources. Confidence scores may not be available, so the only measure is the match rate itself. Some verification is needed to measure false matches - a particular issue with external vendors who are paid on the number of hits.

Quality of results from internal systems. Completeness is measured by the scope of data provided: purchases, payments, returns, refunds and service interactions. These may originate in several different systems. Currency is measured by how long it takes a new transaction to become available. It can range from milliseconds to a month. Important measures not related to data quality include response time and prediction accuracy.
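
Currency can be estimated by comparing when a transaction occurred with when it first became visible to the lookup system. The sketch below assumes both timestamps are available for a sample of transactions, which may not be true of every source system.

```python
from datetime import datetime
from statistics import median


def currency_lag_seconds(pairs):
    """Return median and maximum availability lag in seconds.

    `pairs` is a non-empty list of (created_at, available_at) datetime
    pairs. Where these timestamps come from is an assumption; many
    systems will need extra logging to capture `available_at`.
    """
    lags = [
        (available - created).total_seconds() for created, available in pairs
    ]
    return {"median": median(lags), "max": max(lags)}


if __name__ == "__main__":
    sample = [
        (datetime(2008, 5, 1, 12, 0, 0), datetime(2008, 5, 1, 12, 0, 1)),
        (datetime(2008, 5, 1, 12, 0, 0), datetime(2008, 5, 1, 12, 1, 0)),
        (datetime(2008, 5, 1, 12, 0, 0), datetime(2008, 5, 2, 12, 0, 0)),
    ]
    print(currency_lag_seconds(sample))
```

Reporting the maximum alongside the median helps expose a slow batch feed that an average would hide.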

Return on promotion case. This describes direct response marketers who use lifetime value to optimize promotion spending. Technical data quality measures include:

Cost data. Accuracy, completeness and consistency. The most important measure is percentage of missing values, because many marketers fail to record the necessary information in the marketing system. Another key measure is variation between the marketing system and the accounting system, because entries in the marketing system may not be revised to reflect actuals.
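
Both cost measures can be sketched in a few lines. The example assumes promotion costs are keyed by a shared promotion id in each system and uses a 5 percent tolerance for flagging differences. The ids, field layout and tolerance are all hypothetical.

```python
def cost_quality(marketing, accounting, tolerance=0.05):
    """Measure missing costs and marketing-vs-accounting variation.

    `marketing` maps promotion id -> recorded cost, or None when missing.
    `accounting` maps promotion id -> actual cost.
    Returns (missing_share, variances), where `variances` maps promotion
    id -> relative difference for entries beyond `tolerance`.
    The 5% tolerance is an assumed value.
    """
    total = len(marketing)
    missing = sum(1 for cost in marketing.values() if cost is None)
    variances = {}
    for promo_id, cost in marketing.items():
        actual = accounting.get(promo_id)
        if cost is None or not actual:
            continue  # nothing to compare, or no usable actual cost
        relative_diff = abs(cost - actual) / actual
        if relative_diff > tolerance:
            variances[promo_id] = relative_diff
    return (missing / total if total else 0.0), variances


if __name__ == "__main__":
    marketing = {"p1": 1000.0, "p2": None, "p3": 500.0, "p4": 200.0}
    accounting = {"p1": 1000.0, "p2": 300.0, "p3": 625.0, "p4": 200.0}
    print(cost_quality(marketing, accounting))
```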

Customer integration. Accuracy and completeness. Records for the same customer are set up independently in several systems and then merged. Measures of incomplete merges include refunds without a corresponding purchase and repurchases without an initial order.
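
The two incomplete-merge signals mentioned above can be detected with a simple scan of merged transaction history. The transaction kinds used here ("initial_order", "repurchase", "refund") are assumed labels, and a real system would need its own codes mapped onto them.

```python
from collections import defaultdict


def incomplete_merge_signals(transactions):
    """Find customers whose merged history looks incomplete.

    `transactions` is an iterable of (customer_id, kind) pairs, where
    kind is "initial_order", "repurchase" or "refund" (hypothetical
    labels). Returns two sorted lists of customer ids:
      - refunds with no purchase of any kind
      - repurchases with no initial order
    """
    kinds_by_customer = defaultdict(set)
    for customer_id, kind in transactions:
        kinds_by_customer[customer_id].add(kind)

    purchase_kinds = {"initial_order", "repurchase"}
    refund_without_purchase = sorted(
        c
        for c, kinds in kinds_by_customer.items()
        if "refund" in kinds and not (kinds & purchase_kinds)
    )
    repurchase_without_initial = sorted(
        c
        for c, kinds in kinds_by_customer.items()
        if "repurchase" in kinds and "initial_order" not in kinds
    )
    return refund_without_purchase, repurchase_without_initial


if __name__ == "__main__":
    history = [
        ("a", "initial_order"), ("a", "refund"),
        ("b", "refund"),
        ("c", "repurchase"),
        ("d", "initial_order"), ("d", "repurchase"),
    ]
    print(incomplete_merge_signals(history))
```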