BACKGROUND: Interact Management Consultants Pty Ltd develops and customizes customer information systems for banking, finance and other industries in the Asia Pacific region, with more than 300 projects for over 100 clients. Interact also builds solutions aimed at migrating from account-based systems to customer-based systems.


PROBLEM SOLVED: An Australian state government agency with multiple departments (over 12) commissioned the development of a prototype authoritative address file with the aim of providing greater levels of address accuracy and detail for all users. Currently each state department maintains its own address file; to test the prototype, four departments provided input address data. Each input file was in a different format and contained varying levels of address information (e.g., one source provided property-level information, with one record per multi-unit property, while others carried several records for that one property, one for each unit). These source files were converted to one standard format as part of the scrub process so that they could be matched against each other. The matching process then needed to accommodate the differing levels of address data provided by the different sources.
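To make the property-level versus unit-level problem concrete, the following is a minimal Python sketch of the general idea, not INTEGRITY's rule-set language or the project's actual scrub process. The address layout, the regular expression, the abbreviation table and the names standardize and property_key are all assumptions made for illustration. It parses a few common Australian street address forms into one standard record and derives a key that ignores the unit, so that unit-level records can be rolled up to the property they belong to.

```python
import re
from typing import NamedTuple, Optional


class Address(NamedTuple):
    unit: Optional[str]
    number: str
    street: str
    street_type: str
    suburb: str


# Hypothetical abbreviation table; a real rule set would be far larger.
STREET_TYPES = {"ST": "STREET", "RD": "ROAD", "AVE": "AVENUE", "AV": "AVENUE"}

# Assumed input shapes: "Unit 3, 12 George St, Sydney", "3/12 George St, Sydney",
# or "12 George St, Sydney" (no unit, i.e. a property-level record).
PATTERN = re.compile(
    r"^(?:(?:UNIT|FLAT|U)\s*(?P<unit1>\w+)\s*,?\s*|(?P<unit2>\w+)\s*/\s*)?"
    r"(?P<number>\d+[A-Z]?)\s+(?P<street>.+?)\s+(?P<type>[A-Z]+)\s*,\s*(?P<suburb>.+)$"
)


def standardize(raw: str) -> Address:
    """Convert one free-form address line into the standard record."""
    text = " ".join(raw.upper().split())  # normalize case and whitespace
    m = PATTERN.match(text)
    if m is None:
        raise ValueError(f"unrecognized address format: {raw!r}")
    street_type = STREET_TYPES.get(m.group("type"), m.group("type"))
    return Address(
        unit=m.group("unit1") or m.group("unit2"),
        number=m.group("number"),
        street=m.group("street"),
        street_type=street_type,
        suburb=m.group("suburb"),
    )


def property_key(addr: Address) -> tuple:
    """Key that drops the unit, so unit records match their parent property."""
    return (addr.number, addr.street, addr.street_type, addr.suburb)
```

Under these assumptions, "Unit 3, 12 George St, Sydney" and "3/12 George Street, Sydney" standardize to the same record, and both share a property key with the property-level entry "12 George St, Sydney".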

PRODUCT FUNCTIONALITY: Utilizing the rule sets for addresses already in the INTEGRITY product, we customized and created a new rule set for Australian street addresses. With this new capability and the Windows-based Workbench of the INTEGRITY Data Re-engineering Environment, we built and maintained procedures for data investigation, standardization, matching across data values and reconciliation across records. Using the prebuilt procedures, we parsed and analyzed free-form fields and constructed rule sets for matching, linked related records lacking a common key and constructed new records based upon data value reconciliation and consolidation rules. We utilized INTEGRITY's data cleansing, parsing, lexical analysis, probabilistic matching and comprehensive data typing. In the future, we will consider utilizing the INTEGRITY Callable Libraries for real-time data defect and anomaly prevention.
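For readers unfamiliar with probabilistic matching, one common formulation scores a candidate pair by summing per-field agreement and disagreement weights derived from two probabilities: m, the chance a field agrees when two records truly refer to the same entity, and u, the chance it agrees by coincidence. The Python sketch below illustrates that general technique only; the review does not describe how INTEGRITY computes its scores, and the field names, the m and u values and the function match_weight are hypothetical.

```python
import math

# Hypothetical (m, u) probabilities per field, chosen only for illustration.
FIELD_PROBS = {
    "number": (0.95, 0.05),
    "street": (0.90, 0.01),
    "suburb": (0.95, 0.10),
}


def match_weight(a: dict, b: dict) -> float:
    """Sum log2 agreement/disagreement weights over the compared fields."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        # A missing or empty value is treated as a disagreement in this sketch.
        if a.get(field) and a.get(field) == b.get(field):
            total += math.log2(m / u)              # agreement adds evidence
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return total
```

A pair whose total exceeds a chosen upper threshold would be treated as a match, one below a lower threshold as a non-match, and anything in between would be sent for clerical review. Those thresholds are a tuning decision and are not shown here.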

STRENGTHS: INTEGRITY's investigation module is helpful in providing pointers to problem areas in standardization. The rule set customization enables the user to control how the data is conditioned and standardized. The interface is easy to use, and the prebuilt procedures are easy and quick to build and run. The survivorship capabilities on the matching are extremely useful, particularly the ability to migrate missing field contents from "loser" records in order to build the "best-of-breed" record.
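The survivorship idea described above can be sketched in a few lines of Python. This is an illustration of the concept under simple assumptions, not INTEGRITY's implementation: records are plain dictionaries, the "winner" is assumed to be the record with the most populated fields, and the names populated and build_survivor are hypothetical.

```python
def populated(record: dict) -> int:
    """Count the fields that carry a real value."""
    return sum(1 for value in record.values() if value not in (None, ""))


def build_survivor(records: list) -> dict:
    """Pick the most complete record, then fill its gaps from the losers."""
    if not records:
        raise ValueError("need at least one matched record")
    # Assumed survivorship rule: the record with the most populated fields wins.
    ranked = sorted(records, key=populated, reverse=True)
    survivor = dict(ranked[0])
    for loser in ranked[1:]:
        for field, value in loser.items():
            # Migrate a value only where the survivor has nothing to offer.
            if survivor.get(field) in (None, "") and value not in (None, ""):
                survivor[field] = value
    return survivor
```

For example, if one matched record carries the street name but no postcode and another carries the postcode but no street name, the survivor ends up with both. A production rule set would typically choose the winner per field (most recent source, most trusted department and so on) rather than by a simple field count.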

WEAKNESSES: Presently some coding is needed to put the data into the input format that INTEGRITY expects. While the matching capabilities are robust, the matching process can be complex and could be simplified.

SELECTION CRITERIA: The customizable rule set was a key criterion because it let us create a consolidated address file in a short period of time. This capability was critical and not possible with some of the more "black-box" types of implementations. The robustness of the product in the areas of investigation and matching was also key for our future requirements.

DELIVERABLES: INTEGRITY delivered reports that analyzed existing data to assess its actual condition and to identify the content of data fields and statistics relative to the population of the content. INTEGRITY transformed, standardized and reconciled the data to very high accuracy rates.

VENDOR SUPPORT: We had on-site education. Because the instructor was very experienced with the product, he was able to flexibly change the class to address our particular needs; and we actually started developing our requirements during the class.

DOCUMENTATION: The INTEGRITY Getting Started manual and user guide are complete and helpful. The Workbench provides Windows-based, context-sensitive help that guided us through the major processes of data investigation, standardization, matching and reconciliation. However, the documentation could be improved with the inclusion of "starter" outlines with parameters and values for common uses. In addition, there presently is no troubleshooting guide.
