BACKGROUND: Nortel Networks is a global leader in telephony, data, wireless and wireline solutions for the Internet. The company had 1998 revenues of $17.6 billion and serves carrier, service provider and enterprise customers globally. Today, Nortel Networks is creating a high-performance Internet that is more reliable and faster than ever before. It is redefining the economics and quality of networking and the Internet through unified networks that promise a new era of collaboration, communications and commerce.
PLATFORMS: Intel, NT.
PROBLEM SOLVED: We manage a large suite of databases and were looking for a tool that would allow us to join data from multiple tables by matching a field in each record, even when the join fields weren't exactly identical, such as "Brigs Inc." and "The Briggs Corp." DataFlux has a suite of software tools called dfPower that contains a standardization module and a match module. We solved our problem with the match module, which appends a fuzzy logic-based match key to each record in each table. This match key is created with user-supplied parameters that depend on the types of data being matched. We can create the match key based on a single field or on multiple fields. For example, we can create a match key on both company name and city or just company name, depending upon our requirements. By running the match module, we were able to write a corresponding match key as an additional field for each record directly into the database (we could have created a new table to store the match key as well). We were then able to create new tables with the SQL "JOIN" statement and write queries across multiple tables. This software has allowed us to look at our suite of tables and reevaluate the functionality of our data by enabling us to build new databases or write queries based on multiple tables. The dfPower Series also includes a standardization module, which helps us to standardize any inconsistently represented data within our databases.
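To illustrate the approach described above, here is a minimal Python sketch of writing a match key into each table and then joining on it. It is not DataFlux's algorithm: the match_key function, the noise-word list, and the table and column names are all hypothetical stand-ins, and the key rule (drop noise words, collapse doubled letters) is far cruder than a commercial matching engine.

```python
import re
import sqlite3

# Hypothetical noise words; the actual dfPower rules are not described in this review.
NOISE_WORDS = {"the", "inc", "corp", "co", "company"}


def match_key(name: str) -> str:
    """Crude stand-in for a fuzzy match key: drop noise words and doubled letters."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    kept = [w for w in words if w not in NOISE_WORDS]
    joined = "".join(kept)
    # Collapse runs of the same character so "briggs" and "brigs" agree.
    return re.sub(r"(.)\1+", r"\1", joined)


def build_demo() -> list:
    """Store a match key beside each record, then join two tables on it."""
    conn = sqlite3.connect(":memory:")
    # Table and column names are illustrative assumptions.
    conn.execute("CREATE TABLE customers (name TEXT, match_key TEXT)")
    conn.execute("CREATE TABLE orders (company TEXT, amount REAL, match_key TEXT)")
    customers = ["Brigs Inc.", "Acme Co."]
    orders = [("The Briggs Corp.", 1200.0), ("Zenith Corp.", 50.0)]
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(n, match_key(n)) for n in customers],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(c, a, match_key(c)) for c, a in orders],
    )
    # The join compares the stored keys, not the raw company names.
    rows = conn.execute(
        "SELECT c.name, o.company FROM customers c "
        "JOIN orders o ON c.match_key = o.match_key"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(build_demo())
```

Storing the key as an extra column, as the review describes, means the fuzzy work is done once per record and every later query is an ordinary equality join.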
PRODUCT FUNCTIONALITY: Installation is quick and painless, from downloading on the Web and evaluating the trial software to getting the serial number and running the match module. Installing the product, connecting to the database and selecting user-defined parameters are all intuitive. Since the product contains ODBC drivers for more than 30 different database platforms, it was very easy for us to connect to our various data sources. Setup allows you to choose the working directory, set the degree of matching (fuzzy logic from five percent to 100 percent matching sensitivity, as well as choosing the number of characters to match if required), set your index key (if required) and assign a match key field within the database (or have the software create one for you on the fly). Running the match module is straightforward, and it processes about 400,000 records an hour (based on a dual Pentium 500).
STRENGTHS: As a suite of database management tools, dfPower Series is flexible, intuitive and can be used for a variety of tasks including data cleansing, matching and data standardization. The match key utilizes a kind of phonetic matching and treats common variations of words the same, such as "ltd" and "limited." This new category of data management software will allow many companies to get significantly more value from their data.
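The review says only that the match key uses "a kind of phonetic matching" and treats variants such as "ltd" and "limited" alike. As a rough illustration of that idea, the Python sketch below expands a few abbreviations and then applies classic American Soundex to each word. Soundex is swapped in here as a well-known example, not as the method DataFlux uses, and the VARIANTS table and the phonetic_key function are hypothetical (only the "ltd"/"limited" pair comes from the review).

```python
import re

# Hypothetical variant table; only "ltd" -> "limited" is taken from the review.
VARIANTS = {"ltd": "limited", "corp": "corporation", "inc": "incorporated"}

# Standard Soundex letter groups.
SOUNDEX_CODES = {}
for letters, digit in (
    ("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
    ("l", "4"), ("mn", "5"), ("r", "6"),
):
    for ch in letters:
        SOUNDEX_CODES[ch] = digit


def soundex(word: str) -> str:
    """Classic American Soundex: first letter plus three digits."""
    word = re.sub(r"[^a-z]", "", word.lower())
    if not word:
        return ""
    result = word[0].upper()
    prev = SOUNDEX_CODES.get(word[0], "")
    for ch in word[1:]:
        code = SOUNDEX_CODES.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "hw":  # h and w do not separate letters with equal codes
            prev = code
    return (result + "000")[:4]


def phonetic_key(name: str) -> str:
    """Expand known abbreviations, then join the Soundex code of each word."""
    words = re.findall(r"[a-z]+", name.lower())
    expanded = [VARIANTS.get(w, w) for w in words]
    return "-".join(soundex(w) for w in expanded)


if __name__ == "__main__":
    print(phonetic_key("Smith Ltd"), phonetic_key("Smyth Limited"))
```

Under these assumptions, "Smith Ltd" and "Smyth Limited" produce the same key, which is the behavior the review attributes to the match module.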
WEAKNESSES: Because each copy requires its own unique serial number, the licensing procedure forced us to contact DataFlux every time we installed the software on a new machine. This was not a big drawback, but it was a minor inconvenience each time we moved the software to a faster PC.
SELECTION CRITERIA: We chose DataFlux because at the time they had a unique product on the market and had good references. The pre-sale support along with the evaluation product and data samples helped us to visualize how the process would work and what the product could and could not do. They also were rolling out a new version of the product at the time that allowed us to do what we needed to do.
VENDOR SUPPORT: The support offered was good, and the support staff explained things very well. We also purchased a maintenance contract which entitles us to revisions.
DOCUMENTATION: Extensive documentation was provided in .pdf format. Both the standardization and matching .pdf files seem to be complete.