CATEGORY: Data Quality

REVIEWER: Shannon L. Mabry, lead DBA for Network Associates.

BACKGROUND: Network Associates is the world's largest independent network security and management software company and the eighth largest independent software company overall. Network Associates is the culmination of best-of-breed technologies from the world's leading software developers. These leading brands are used by Network Associates' more than 60 million customers around the globe and include McAfee anti-virus, PGP encryption, Gauntlet firewall, Magic Help Desk applications and the Sniffer family of network analyzers.

PLATFORMS: Microsoft Windows NT and Microsoft SQL Server.

PROBLEM SOLVED: We were searching for a cost-effective, flexible and user-friendly solution to improve the accuracy, consistency and usability of corporate customer data. We needed a more efficient way to remove duplicates and to cross-reference a computer intelligence (CI) relational database with our database of corporate customers for enterprise reporting purposes. We solved this problem by using the dfPower Studio Match and Standardization Modules from DataFlux. Initially, we made a big push to identify all of our customer enterprises using match codes. A match code is an identifier that groups similar information in a database, so that a generated report returns all the information you are looking for. We were able to take fields of our choosing and create a match code that represents how a given customer's name sounds and then compare that to our other data stores. Once the matches are identified, the Standardization Module converts all the variant instances of a company name to one standardized representation. We now have a good understanding of who our customers are, along with accurate counts. In the past, we ran specific SQL statements to reduce duplicates, but that process only worked well on an entire data field and did not provide the flexibility to work with data at the element level. DataFlux dfPower Studio enables us to cleanse and integrate data at the phrase level, which also enabled us to cross-reference data with the CI database containing additional information about customers and prospects. This level of detail is very useful for the sales and marketing groups.
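
As a rough illustration of the match-code and standardization idea described above, here is a minimal Python sketch. It is not the DataFlux algorithm: the noise-word list, the vowel-dropping key and the "most common spelling wins" rule are all assumptions made for the example, and the function names `match_code` and `standardize` are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical list of words that carry no identifying information.
NOISE_WORDS = {"inc", "incorporated", "corp", "corporation",
               "co", "company", "ltd", "llc", "the"}


def match_code(name: str) -> str:
    """Build a crude sound-alike key for a company name (illustrative only)."""
    # Lowercase and turn punctuation into spaces before splitting into words.
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    parts = []
    for word in words:
        if word in NOISE_WORDS:
            continue
        # Keep the first character, drop later vowels, collapse doubled letters.
        tail = re.sub(r"[aeiouy]", "", word[1:])
        parts.append(re.sub(r"(.)\1+", r"\1", word[0] + tail).upper())
    return "".join(parts)


def standardize(names):
    """Replace every name with the most common spelling in its match-code group."""
    groups = defaultdict(list)
    for name in names:
        groups[match_code(name)].append(name)
    canonical = {code: Counter(members).most_common(1)[0][0]
                 for code, members in groups.items()}
    return [canonical[match_code(name)] for name in names]


# Made-up sample data, not taken from any real customer database.
sample = ["Acme Widgets, Inc.", "ACME Widgets Incorporated",
          "Acme Widgets, Inc.", "Globex Corp"]
cleaned = standardize(sample)
```

Records that share a match code can then be counted once in a report, or joined against a second data store that has been keyed the same way.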

PRODUCT FUNCTIONALITY: From my workstation, I was able to download and install the free evaluation version from the Web, which helped me identify issues in my data before purchasing the software. It is all point-and-click driven and is very intuitive. The Match Module is very flexible in allowing us to work at the phrase level and set the degree of sensitivity for our matching. In the future, I want to use the dfPower Verify Module to perform address verification; the module is CASS (Coding Accuracy Support System) certified by the United States Postal Service.
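
To make the idea of a matching sensitivity concrete, the sketch below uses a similarity threshold built on Python's standard `difflib.SequenceMatcher`. This is a swapped-in technique for illustration, not how the Match Module works internally; the function name `is_match` and the threshold values are assumptions.

```python
from difflib import SequenceMatcher


def is_match(a: str, b: str, sensitivity: float) -> bool:
    """Treat two names as the same when their similarity ratio reaches the threshold.

    A higher sensitivity (closer to 1.0) demands a closer match and so
    produces fewer, stricter matches.
    """
    ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return ratio >= sensitivity


# Made-up names: a looser setting accepts the pair, a stricter one rejects it.
loose = is_match("Acme Widgets", "Acme Widget", sensitivity=0.90)
strict = is_match("Acme Widgets", "Acme Widget", sensitivity=0.99)
```

Running the same data at several settings and reviewing the borderline pairs is one way to choose a threshold before committing to any updates.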

STRENGTHS: The DataFlux dfPower Studio suite of data management tools is fast, flexible, affordable and usable. It is the only out-of-the-box solution we found that didn't require any ongoing technical or consultant support to install, customize or use. From my desktop, I can access our data; cleanse, match, standardize and augment it; and ensure accurate reporting. Our typical routine takes only 15-20 minutes to run through our database of over half a million records.

WEAKNESSES: In the future, I am looking forward to DataFlux enhancing its offerings to work well with extended character sets. This will allow for additional international support and standardization for other countries. I understand that DataFlux is currently working on this.

SELECTION CRITERIA: We were looking for a solution that we could install and use at a manageable cost and that provided fast and accurate results. The fact that we were able to download and install the product and see how it worked on our data before we bought it was a big factor.

DELIVERABLES: We analyze the data from the databases. From the analysis, we can find all the issues that exist, summarized in table form. We export that information to a spreadsheet for review. Once our data group reviews the spreadsheet, we update our databases.
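
The analyze, summarize and export steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records are hypothetical `(id, name)` pairs, the default grouping key is a simple case- and whitespace-insensitive normalization rather than a DataFlux match code, and the column names are invented for the example.

```python
import csv
import io
from collections import defaultdict

FIELDS = ["match_key", "count", "record_ids", "variants"]


def duplicate_report(records, key=lambda name: " ".join(name.lower().split())):
    """Summarize groups of records whose names normalize to the same key."""
    groups = defaultdict(list)
    for rec_id, name in records:
        groups[key(name)].append((rec_id, name))
    rows = []
    for match_key, members in sorted(groups.items()):
        if len(members) > 1:  # only groups with possible duplicates are issues
            rows.append({
                "match_key": match_key,
                "count": len(members),
                "record_ids": ";".join(str(rec_id) for rec_id, _ in members),
                "variants": " | ".join(sorted({name for _, name in members})),
            })
    return rows


def to_csv(rows) -> str:
    """Render the summary table as CSV text that a spreadsheet can open."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up sample records for illustration.
records = [(1, "Acme Widgets"), (2, "acme  widgets"), (3, "Globex")]
report_csv = to_csv(duplicate_report(records))
```

Keeping the record IDs in the export lets the reviewers mark which rows to merge, so the database update can be driven from the reviewed spreadsheet.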

VENDOR SUPPORT: Support from the vendor has been great. The only time we had difficulty getting a quick response was during their recent office move. Working with our salesman, Richard Crawford, is more like working with a technical guy than a salesman.

DOCUMENTATION: Extensive documentation was provided in PDF format. I have found that any time I picked up the phone to resolve an issue, it could have been resolved by reading the documentation.
