Classifications such as gender determine which medical diagnoses and procedures are valid for a health provider and whether reimbursement by an insurance company is appropriate. Order type classifications in retail and wholesale determine which pricing rules apply. Get the classification wrong, and you invoice for the wrong price. Country code classifications dictate which address format is appropriate for international mailing. Millions of pieces of mail go astray because of incorrect address formats.
Quality Problems
Several quality problems confront classification information:
- New classification options become required, such as a drop-ship order type, but the code is not added to the order-type table. Drop-ship order records then cannot be classified as such.
- Obsolete classification code values can be inadvertently selected for a record. For example, frequent shopper classifications may change from four classifications (diamond, platinum, gold and silver) to three (dropping the silver). This creates two potential quality problems:
1) If referential integrity (the requirement that a frequent shopper customer must have a valid classification code) is not enforced, customer records may be left with a classification of silver frequent shopper, but no program associated with them. Or,
2) If frequent shopper classification is selectable by information producers, customers may be classified into the no-longer-valid silver classification.
- If there is no business subject matter steward in charge of controlling updates to classification values, inappropriate or conflicting code values may be created.
- Classification code values that are created outside of your enterprise, such as ISO country codes, may not be available quickly enough for your need. If not, you must either use an existing, incorrect code or create a temporary code value. Each choice brings its own problem:
1) If you use an existing value for a different classification, such as the country classification code for the former Yugoslavia (YUG) for addresses in the emerging Slovenia and Croatia, you have an incorrect address. When the codes for Slovenia (SVN) and Croatia (HRV) are finally assigned, you will need to sort out which addresses belong with which new country.
2) If you create a temporary code, you will need to update all records containing it when the newly assigned code becomes available. If you don't, processes using the invalid codes will fail.
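Two of the problems above, records left holding an obsolete code and records holding a temporary code that must later be replaced, can be checked for mechanically. The following is a minimal sketch, not a prescribed implementation. The record layout, the field names, the function names and the temporary country code "X01" are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical reference table of currently valid frequent shopper codes
# (silver has been dropped from the program).
VALID_SHOPPER_CODES = {"DIAMOND", "PLATINUM", "GOLD"}

# Hypothetical customer records, for illustration only.
customers = [
    {"id": 1, "shopper_class": "GOLD"},
    {"id": 2, "shopper_class": "SILVER"},
    {"id": 3, "shopper_class": "DIAMOND"},
    {"id": 4, "shopper_class": "SILVER"},
]


def find_invalid_codes(records, field, valid_codes):
    """Return the records whose classification code is not in the valid set."""
    return [r for r in records if r[field] not in valid_codes]


def replace_code(records, field, old_code, new_code):
    """Replace old_code with new_code in place; return how many records changed."""
    changed = 0
    for record in records:
        if record[field] == old_code:
            record[field] = new_code
            changed += 1
    return changed


# Referential integrity check: which customers still carry a dropped code?
orphans = find_invalid_codes(customers, "shopper_class", VALID_SHOPPER_CODES)
orphan_counts = Counter(r["shopper_class"] for r in orphans)

# Temporary-code cleanup: "X01" is an assumed placeholder that was used for
# Slovenia until the official code SVN became available.
addresses = [
    {"id": 10, "country": "X01"},
    {"id": 11, "country": "HRV"},
    {"id": 12, "country": "X01"},
]
updated = replace_code(addresses, "country", "X01", "SVN")
```

Note that the cleanup is only this simple when the temporary code stood for exactly one country. Records that reused an existing code such as YUG cannot be remapped this way, because the code alone does not say which new country each address belongs to.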
Quality Management Techniques
Outline a controlled process for defining information and creating valid codes. The process should include the following steps:
- Assign a business information steward and an information resource management specialist to validate business definition and information design principle integrity.
- Create classification attributes that represent a single kind of categorization of objects or events. For example, the attribute Product Line Code should not classify both product line and sub-line as one code.
- Clearly define the meaning of the classification.
- Clearly define the meaning of each code type as a single, nonoverlapping classification. For example, Frequent Flyer classifications should not have mileage ranges that overlap, as they do in this flawed set:
Silver = from 25,000 to 49,999 miles
Gold = from 50,000 to 74,999 miles
Platinum = from 70,000 to 99,999 miles
Diamond = 100,000 miles and over
(Note that the mileages for gold and platinum overlap.)
- Establish effective end dates for the classification definition and business rules for each classification definition. These can change and are required to analyze any apparent anomalies when comparing data across time.
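The nonoverlapping rule for range-based classifications can be tested before a set of codes is published. The sketch below, under stated assumptions, loads the frequent flyer ranges quoted above and reports every pair whose ranges share at least one mileage value. The function name and the use of infinity for the open-ended Diamond range are illustrative choices, not part of the original definitions.

```python
from itertools import combinations
import math

# Mileage ranges as quoted in the example above (inclusive bounds).
# math.inf stands in for "and over".
FLYER_RANGES = {
    "Silver": (25_000, 49_999),
    "Gold": (50_000, 74_999),
    "Platinum": (70_000, 99_999),
    "Diamond": (100_000, math.inf),
}


def find_overlaps(ranges):
    """Return every pair of classification names whose inclusive ranges overlap."""
    overlaps = []
    for (name_a, (low_a, high_a)), (name_b, (low_b, high_b)) in combinations(
        ranges.items(), 2
    ):
        # Two inclusive ranges overlap when each starts before the other ends.
        if low_a <= high_b and low_b <= high_a:
            overlaps.append((name_a, name_b))
    return overlaps


overlapping_pairs = find_overlaps(FLYER_RANGES)
print(overlapping_pairs)  # [('Gold', 'Platinum')]
```

Comparing every pair is adequate for the small number of values a classification usually has. A similar check could look for gaps between ranges, which would leave some mileages with no classification at all.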
Business information stewards and information producers creating classification codes must understand how they are used across the enterprise and must assure accuracy and completeness in the set of valid values. They must keep them current when changes occur.
Train the information producers who create records with classifications, so they know how to properly classify objects or events.
Design edit and validation tests to prevent inadvertent errors or to automatically classify the objects or events where all required data is known, such as the aggregation of mileage of frequent flyers.
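As one illustration of automatic classification, a frequent flyer tier can be derived from aggregated mileage instead of being keyed in by an information producer. This is a sketch only. The tier boundaries are assumptions: they follow the example ranges earlier in this article, with the Platinum lower bound moved to 75,000 so that the ranges no longer overlap. The function name and the input format are also hypothetical.

```python
from bisect import bisect_right

# Assumed lower bounds for each tier, in ascending order. The 75,000 Platinum
# boundary is an assumed correction of the overlapping example ranges.
TIER_LOWER_BOUNDS = [25_000, 50_000, 75_000, 100_000]
TIER_NAMES = [None, "Silver", "Gold", "Platinum", "Diamond"]


def classify_flyer(flight_miles):
    """Aggregate per-flight mileage and return the tier name, or None if below Silver.

    Raises ValueError if any mileage entry is not a non-negative integer,
    so that bad input is rejected instead of silently misclassified.
    """
    total = 0
    for miles in flight_miles:
        if isinstance(miles, bool) or not isinstance(miles, int) or miles < 0:
            raise ValueError(f"invalid mileage entry: {miles!r}")
        total += miles
    # bisect_right counts how many lower bounds the total has reached.
    return TIER_NAMES[bisect_right(TIER_LOWER_BOUNDS, total)]


print(classify_flyer([20_000, 5_000]))   # Silver
print(classify_flyer([60_000, 14_999]))  # Gold
print(classify_flyer([100_000]))         # Diamond
```

Because the classification is computed from data that is already known, the only edit needed at entry time is on the mileage itself, which removes one opportunity for a producer to select the wrong code.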
When millions of customers require multiple classifications, from personal title to gender to frequent shopper, flyer or guest type, to state or province and country codes, classification information (reference data) requires zero defects.
What do you think? Let me know at Larry.English@infoimpact.com.
Larry P. English is president and principal of INFORMATION IMPACT International, Inc., Brentwood, Tennessee, and the author of the widely acclaimed book, Improving Data Warehouse and Business Information Quality: Methods for Reducing Costs and Increasing Profits. English is cofounder of the International Association for Information and Data Quality (www.iaidq.org). English is an internationally recognized speaker, teacher, consultant and author and may be reached at larry.english@infoimpact.com or through his Web site at www.infoimpact.com. For more on how to improve your IQ principles and techniques, and prevent your organization from wasting millions in information scrap and rework, join the IAIDQ (visit www.iaidq.org).









