JAN 1, 2006 1:00am ET

Related Links

10 Sustainability Predictions for 2011
February 23, 2011
A Letter to Future Employees: Embrace Analytics
February 3, 2011
A Hunger for Risk
January 6, 2011

Web Seminars

Suit Yourself: An Effective Recipe for Self-Service Analytics
March 20, 2012
How to Narrow the IT/Business Communication Gap
March 21, 2012
Enhance and Expand BI with Mobile
Available On Demand

The Perfect Match - Seven Steps to a Master Data Match, Part 1

Print
Reprints
Email

The following article is excerpted from the white paper, "The Perfect Match: 7 Steps to a Match" written by Cory Shouse. Visit http://www.csiwhq.com/news/whitepaper_requests.asp for a copy of the full paper.

Master data management (MDM) is a series of processes put in place to ensure that reference data is kept up to date and coordinated across an enterprise. A solid MDM approach can have impacts on the business and quality of your systems that make it well worth the investment; however, it is no trivial undertaking. Perhaps one of the most complex and underestimated tasks associated with master data management is matching. Matching is at the core of MDM. For our purposes, let's define a "match" as finding similar customers, vendors or products in multiple disparate systems. Preparing for the match, performing the match and perfecting the results of the match can significantly increase the quality of the reference data within your organization.

In a simple example, Customer A may be represented in two disparate systems. In system 1, the customer may be known by its trade style (i.e., Paws-N-Claws Pet Sitting); however, in system 2 it may be known by its legal name (i.e., PNC, Inc.). If we want to get a good understanding of Customer A's spend across both systems, then we must have a way to match the trade style with the legal name to identify this customer as the same entity.

The value of understanding this relationship goes beyond business intelligence (BI) solutions. Enterprise resource planning (ERP) systems, customer relationship management (CRM) applications and other IT applications can all benefit from a dependable matching approach.

Preparing for the Match

Preparing for a match involves investigating the conditions of the source data, standardizing common elements and, when available, finding a partner to leverage as an external master reference.

Step 1: Clean Your Data

Clean data increases the probability of a match. Attempting a match with dirty data is similar to searching for a friend on Google by using only his or her first name. You may get results, but culling the numerous pages of results will take time. To ensure the same does not happen with MDM, take the time to investigate and understand your data source(s). Investigation of data can range from a simple count of the nulls by column to a complex statistical analysis finding tendencies in the data. A good picture of the primary keys, relationships, constraints and domain values (distincts/ranges) will later help define the rules for the match.

Step 2: Standardize Common Elements

The next step in preparing for a match is to identify and standardize common elements that will be leveraged for matching. In order for the elements to be standardized, they must exist in all files you are trying to match. These elements will be the core set used in comparing across the files. "Customer" is often the first subject area that companies attack due to the high value associated with acquiring a single view of the customer. In the case of a customer, seven elements are often leveraged as the standardized columns. These elements, known as the "Super 7," include:

  1. Business Name,
  2. Address,
  3. Industry Classification,
  4. SIC Code,
  5. Sales,
  6. Number of Employees, and
  7. Phone Number.

In the case of item/product, the standard elements might include SKU (for internal systems) or UPC, ISBN and EAN for external matching. In addition, product description, weight or even generic name can be included as required columns for matching.

Step 3: Identify External Reference Source

Finally, where available, identify an external source that can provide a common industry key. Continuing with the customer example, a Social Security number for consumers or DUNS number for companies is common. In the case of a product, it can be more difficult, as products are often unique; however, many industries have established standards. For example, the FDA created the national drug code (NDC) number to uniquely identify pharmaceutical drugs. Some important questions to consider as you look for the right partner include:

  • How often is your database updated?
  • What are the sources for the data?
  • What is coverage of the file for the specific set of data?
  • What formats are available?
  • What other value-added elements are available?

To achieve an optimal match, it may be necessary to attempt matching against multiple vendors. Provide the vendors a set of your data and see which one provides the best match. Leverage that vendor as your primary, and use the secondary vendor in cases where no match was found in the first case.

William wishes to thank Cory Shouse for his contribution to this month's column.

Cory Shouse is a senior architect with Conversion Services International. With more than 10 years of experience in business intelligence, Shouse specializes in helping companies establish, organize and deliver value to the business. He has assisted a number of Fortune 500 companies define quality assurance programs, organizational and staffing plans, change control procedures, and appropriate information and technical architectures. He may be reached at cshouse@csiwhq.com or (469) 939-5385.

In the next column, we will cover performing the match with steps four through six.


 

William McKnight brings process, organizational and architectural focus to building strategies and implementing master data management and data warehousing programs that have consistently, for many years, improved the productivity and performance for his clients including several global corporate giants. McKnight, award-winning consultant and author, can be reached at McKnight Consulting Group.

Cory Shouse is a senior architect with Conversion Services International. With more than 10 years experience in business intelligence, Shouse specializes in helping companies establish, organize and deliver value to the business. He has assisted a number of Fortune 500 companies define quality assurance programs, organizational and staffing plans, change control procedures, and appropriate information and technical architectures. He can be reached at (469) 939-5385 or cshouse@csiwhq.com.

Advertisement

Comments (0)

Be the first to comment on this post using the section below.

Add Your Comments:
You must be registered to post a comment.
Not Registered?
You must be registered to post a comment. Click here to register.
Already registered? Log in here
Please note you must now log in with your email address and password.
Twitter
Facebook
LinkedIn
Login  |  My Account  |  White Papers  |  Web Seminars  |  Events |  Newsletters |  eBooks
FOLLOW US
Please note you must now log in with your email address and password.