
We are in the process of implementing an ERP solution (SAP).

  • June 04 2003, 1:00am EDT


We are in the process of implementing an ERP solution (SAP). We would like to formulate a data management framework for the project and company. Any assistance and guidelines will be appreciated.


Larissa Moss’ Answer: ERP implementations are prime opportunities to improve the quality of your operational data. Several years ago, too many companies rushed through their ERP conversions (largely driven by the Y2K deadline) and propagated their source data quality problems into their new ERP environment. Don't make that mistake! Before embarking on the source-to-target mapping activities, perform some source data archeology and do some data profiling. Look for the following data domain and data integrity violations:

  • Do you have missing data values in your source data elements?
  • Do you have a lot of default values (999999, zeroes or blanks)?
  • Do you have "intelligent" dummy values (example: income 999,999.99 indicates the person is an employee)?
  • Do you have redefined fields (as in COBOL redefine statements)?
  • Do you have reused fields (example: Mstr-Code A, B, C, D, E, F where A, B, C describe the type of customer and D, E, F describe the type of product)?
  • Do you have reused primary keys (example: an old branch number reassigned to a new branch)?
  • Do you have non-unique primary keys (example: a customer with three different customer numbers on his/her three different accounts)?
  • Do you have a lot of cryptic codes (such as 1, 2, 3, where 1 means checking account, 2 means savings account, 3 means mortgage loan account, etc.)?
  • Do you have illogical or contradictory values in dependent fields (example: date of death preceding date of birth, or maximum interest rate being lower than minimum interest rate)?
  • Do you have logic embedded in the data (example: first three digits are the department number)?
  • Do you have "free-form" address lines 1, 2, 3, 4, 5?
  • Do you have inconsistent data values in dependent fields (example: state is New York but ZIP code is 75912 from Texas)?
  • Do you have data that violates business rules (example: senior citizen discounts for persons under 65 years of age)?
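
Several of the checks in this list lend themselves to a short program. The following is a minimal sketch in Python, not a full profiling tool. It assumes each source record has already been read into a dictionary, and the field names (birth_date, death_date, min_rate, max_rate) and the set of default values are hypothetical placeholders that you would replace with those from your own source systems. It covers only three of the checks: missing values, default values, and contradictory values in dependent fields.

```python
from datetime import date

# Hypothetical default/dummy values; replace with those found in your own sources.
DEFAULT_VALUES = {"", "0", "999999"}


def profile_record(rec):
    """Return a list of violation labels found in one source record (a dict)."""
    problems = []

    # Missing values and suspicious defaults (blanks, zeroes, 999999).
    for field, value in rec.items():
        if value is None:
            problems.append(f"missing:{field}")
        elif str(value).strip() in DEFAULT_VALUES:
            problems.append(f"default:{field}")

    # Contradictory dependent fields: date of death before date of birth.
    born, died = rec.get("birth_date"), rec.get("death_date")
    if isinstance(born, date) and isinstance(died, date) and died < born:
        problems.append("contradiction:death_before_birth")

    # Contradictory dependent fields: maximum rate lower than minimum rate.
    lo, hi = rec.get("min_rate"), rec.get("max_rate")
    if isinstance(lo, (int, float)) and isinstance(hi, (int, float)) and hi < lo:
        problems.append("contradiction:max_rate_below_min_rate")

    return problems
```

Running profile_record over every record and counting the labels per field gives the number of records with bad data for each data element.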

Use a data profiling tool or write short programs to examine the data values in your source files and databases, and determine the data quality problems. How many records have bad data? How critical are those data elements to your organization? How clean does your business community require the data to be? Be sure to talk to downstream information consumers, and not just the data originators, about their expectations of data quality.

Then classify your source data elements into three categories: critical data (must be of pristine quality), important but not critical data (should have less than x percent bad data values, with x to be determined by your business community), and insignificant data (used for informational purposes and can contain wrong values without having an impact on the organization). Be sure to cleanse all of the critical data and as much of the important data as you can before or during your ERP conversion.

Monitor the data quality in your ERP databases on an ongoing basis by setting up automated data profiling jobs to run monthly or quarterly audits. Establish a procedure to continuously correct dirty data before it has a chance to contaminate downstream decision support applications.
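
The three-way classification and the "x percent" threshold can be expressed as a small check that a recurring audit job could run. This is only a sketch under stated assumptions: the category names follow the text above, but the threshold values, the function name audit, and the example field names are hypothetical, and the 5 percent figure is a placeholder for whatever x your business community chooses.

```python
# Hypothetical limits: maximum tolerated fraction of bad values per category.
# The 0.05 for "important" stands in for the business-defined "x percent".
MAX_BAD_FRACTION = {"critical": 0.0, "important": 0.05, "insignificant": 1.0}


def audit(bad_counts, total_records, categories):
    """Return {field: (bad_fraction, passed)} for each profiled field.

    bad_counts    -- {field: number of records with a bad value}
    total_records -- number of records profiled
    categories    -- {field: "critical" | "important" | "insignificant"}
    """
    report = {}
    for field, bad in bad_counts.items():
        fraction = bad / total_records if total_records else 0.0
        limit = MAX_BAD_FRACTION[categories[field]]
        report[field] = (fraction, fraction <= limit)
    return report
```

For example, audit({"customer_id": 0, "phone": 8}, 100, {"customer_id": "critical", "phone": "important"}) would pass customer_id and flag phone, because 8 percent bad values exceeds the placeholder limit of 5 percent.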
