Almost every IT professional has experienced a data conversion nightmare. Data conversion projects encompass a little bit of everything - requirements gathering to understand the source and target data, some database modeling to better understand the data, data cleanup work to refine the data and custom development in the form of transformation tools. You have the classic quality/scope/cost trade-off decisions that need to be made in project management.

In a perfect world, all organizations would have enterprise data models and conversions would not be necessary. Because this is rarely the case, data conversions are a great opportunity to resurrect and reinforce all business rules. However, data assessment and data cleanup efforts can be time-consuming and expensive - two things that will make it difficult to include in the scope of the data conversion project. The project sponsor and the users should be aware of the risks associated with omitting this step. It doesn't make much sense to convert poor quality data because nothing can alienate users more than poor quality data. Ultimately, the business case for a data assessment will depend on the level of data quality necessary to support the business purpose of the application.

With technology and labor costs trending lower and demands on software systems intensifying, data conversions to move data from legacy systems are becoming more necessary. These data conversions are almost always more complex than people realize, and the effort involved is usually underestimated. Here are some ideas to make your data conversions less painful.

1. Build the Right Team. The project team needs to have the right blend of skill sets to complete the project. Consider a business analyst to understand the requirements, a DBA to build the target model and to write the load procedures and an ETL analyst to write the transformation scripts if an ETL package is involved. (See idea number two.) The strengths of one resource can compensate for the weaknesses of another, but it will increase your chances of a successful conversion if your technical people have some business knowledge and awareness. In addition, user participation is essential so the requirements and the resulting data model accurately reflect user needs.

2. Choose the Right Tools for the Job. Microsoft SQL Server and Data Transformation Services (DTS) are powerful. Stored procedures combined with DTS packages can be powerful and portable, and pricing is reasonable. Microsoft Access is even more prevalent and could be appropriate for more straightforward conversions. Whatever tool or tools you ultimately select must be easy to automate. If the volume of data is large and the transformations are complex, an ETL tool and/or a data quality tool may be required. Here's a short list of factors to consider in the tool selection process:

  • Amount of data to be converted
  • Complexity of transformations
  • Source and target platform(s)
  • Existing licenses
  • Costs
  • Availability of tools considered
  • Skill sets of conversion team
  • Time frame
  • Applicability of licenses to post-conversion environment

3. Build the Proper Infrastructure. Part of making a data conversion a "real" project is building the proper technical infrastructure for the project, which should include the following: backup routines to save work in progress, adequate security, a staging environment, a development environment, a UAT environment and area for users to view converted data.

4. Control Project Scope. The scope of a data conversion can take on a life of its own. Make sure the project sponsor and the users are aware of the cost of converting certain data elements and the additional risks associated with these elements. Some decisions need to be made as to what data is essential and what isn't. Does it make sense to keep the legacy system running for a short period of time to reduce the scope of the conversion?

5. Stabilize the Requirements and Target. Stabilize the requirements and the data target data model as early as possible. That said, you need to assume things are going to change, and your process needs to be flexible to accommodate these changes. Define the data model in a tool if possible and make sure all requirements are well-documented. Keep your database model and the script to generate it current so the conversion team and developers are always working with the same model. The requirements and target data model must be viewed as a dependency for the conversion effort.

6. Design a Repeatable Process. Design the process and budget your time as if you are going to need to convert the data a hundred times. A conversion is not a development effort, but it should be approached with the same care and attention devoted to application development projects. As many steps as possible should be automated (i.e., stored procedures in SQL Server or macros in Access). At some points during the conversion process, manual intervention might be necessary to check interim results, just like a baker might occasionally stick a toothpick in a cake to see how it is progressing.

If the conversion process does not lend itself to automation, at the very least there should be a well-defined process with a checklist that can be used as a guide during the conversion. The window to do the actual cutover is often narrow, and the only way to minimize system downtime is to automate as many steps as possible. The coding does not need to be elegant, but it should be pragmatic. Repetitive tasks should be modularized with subroutines, so changes (if there are any) only need to be made once.

Treat the conversion as a performance and the preparation as rehearsals. Carrying this analogy further, you should have a full dress rehearsal to ensure that your process works end to end. The conversion process should include documented steps with time estimates for each task and step-by-step instructions of what you are going to do. While the ratio of preparation and testing time will vary depending on the complexity of the source data, the target data, and the number and complexity of the transformations, it could be as high as 40 to 1 (40 hours of preparation for each hour of conversion time). This investment might be hard to stomach because most of the work is throwaway; the conversion routines have little value once the data is converted and updated in the new system. Because automation reduces risk and testing and ensures that you have all the transformations correct, you're doing it a lot more than once. Because there's also a good chance that your go-live date will be delayed, you need to have a process that's repeatable in case it needs to be transitioned to a new resource.

7. Build the Lookup Tables First. In the process of building the target environment, start with the lookup tables. These will be the foundation of the system. One way to address errant values is to use start and end dates for all the lookup tables so values are permissible for the conversion, but cannot be used once the system is in production. As with any automated process, code should be modular whenever possible. Break the conversion process into logical chunks.

  • Map the source data.
  • Load source data into staging tables.
  • Cleanse staging tables.
  • Populate target tables with source data.
  • Transform target tables.
  • Prepare for transfer to target system.

It might make sense to further modularize the target table routines into one module for each resulting target table.
8. Decide When and Where to Enforce Referential Integrity. Importing data into a physical data model that enforces referential integrity can be challenging and frustrating. Referential integrity will need to be turned off in the staging area, but don't wait until conversion day to figure out your approach to loading data in a fully constrained target environment. One technique is to work in a testing area and add constraints one by one. Address the key violations as they occur. Ideally, do as much checking in your staging area as possible, because this is the environment in which the error messages will be most informative.

9. Document Your Work. Document your work and keep it for posterity. At a minimum, there should be a step-by-step approach as a guide for the conversion. In addition, documentation should also include an explanation of where each of the source data fields was mapped and how each of the target data fields was derived. After the conversion, the help desk will need to support and maintain the application and potentially explain to users how the data was derived. It's much easier to document this as you go rather than at the end; these documents should be created while the process is being refined. The other added benefit is that documentation facilitates transitions to new resources if required.

10. Test the Process Repeatedly. During a data conversion, there can be so many detailed steps and transformations that it's easy to make mistakes. Testing is essential in order to catch errors before go-live. Once the data is in production, the effort required to make corrections is much greater. If the processes are automated, testing will be more effective and efficient. If errors are found, go to the source to fix issues discovered rather than fixing at the point of discovery. Address the fix as early in the process as possible so the transformation or the cleansing activity addresses the root cause of the problem rather than a symptom.

Conclusion

Data conversion projects can be quite painful, and it is a rare data conversion project that is executed without a glitch. Unfortunately, there's really only one way to get things right, but an infinite number of ways in which things can go wrong. Hopefully, these steps will help guide you and identify these potential risks. While there's no guarantee that following all these steps will result in a defect-free conversion, they can help you proceed in the right direction. Data conversions preserve the value of your organization's information asset and need to be treated with the same rigor as other IT projects. 

Data Conversions Development and testing of your mission-critical CRM implementation is complete, and you thought you were ready for rollout, but your project manager reports that the data conversion is weeks behind schedule. Sound familiar? Here are 10 steps to guide you through a successful data conversion project.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access