The only constant in IT is change. There are the changes that organizations want to make and the changes they have to make. Regardless of whether the change is caused by a planned, strategic decision or an outside factor such as a merger or new government compliance requirement, most enterprise application changes will involve the migration of data. Recent consolidation in the enterprise application market, led by Oracle's aggressive acquisition path, will even further accelerate the rate at which many organizations rethink the future of their current enterprise application platform and consider a migration.

While each data migration effort is unique, there are many common features that allow a discussion of best practices to benefit those who are planning or are in the middle of a data migration effort. This article discusses proven strategies for success and identifies ways to get the most out of a data migration.

Data Migration Defined

Data migration is the process of transferring data from one repository to another. The repositories in question usually have different storage types, formats and constraints. For the purpose of this discussion, data migration is usually performed programmatically in an attempt to achieve automated migration. It is required when organizations change enterprise applications, upgrade to new platforms or consolidate systems. Due to data dependencies and information hierarchies, a migration needs to be performed in a specific sequence in order to maintain data integrity. Common steps for data to be migrated involve the extraction of data from the source, transformation of data into a format that the target will accept, validation of the transformed data values against the target, mapping of source data fields to target data fields and loading data into the target.

"He who fails to plan ..." is an old cliché but so true when it comes to data migrations. It is estimated that 50 to 60 percent of the time spent on enterprise application implementations involves some aspect of data migration. During this time, significant costs and project resources are consumed by the laborious, repetitive and often complex task of data migration. Without planning for the data migration, the success of the entire project is at risk. To avoid becoming one of these sad statistics, some experienced system implementers suggest managing the data migration as an entirely separate project, allocating to it experienced staff resources and a budget.

It is also advisable to start work on the data migration as early as possible, because too often it is left as an afterthought. When the scope and complexity of the migration become apparent, it is too late to prevent delays in the overall project. An ideal outcome is one where the entire data migration has been automated and tested well in advance of the go-live date - that way, the tested migration processes can be run just in time for the new system to go live, including the latest possible revisions to the data in the source system. Being able to estimate the time that needs to be set aside for the final data migration can also prevent unforeseen delays close to the go-live date. Doing test migrations in advance will help estimate the actual duration, although one should always allow some time for anomalies that likely will occur with untested data.

Figure 1: Data Migration Sequence

Avoid GIGO (Garbage In - Garbage Out)

No matter how good the IT department, how strict user security and how prudent the database administrator (DBA), users always manage to insert bad data into a system. When faced with a migration, your goal should be to avoid loading the garbage in your current (legacy) system into the new system. If proven tools and automation processes are applied to perform the data migration, errors in the source data can be identified and either set aside for correction or corrected on the fly before insertion into the new system. Identifying possible flaws in the data and defining processes for dealing with the errors takes time and resources that are often not in the budget. It is important that this aspect of the data conversion process is part of the plan.

Some system integrators have data quality experts and offer a professional data quality inspection service that makes use of tools to profile data and create reference repositories that contain only valid data. This type of service also takes time and money, but it can save many headaches during the actual data migration and transition to the new system.

The Curse of History

There is a financial cost to migrating data, which means there must be a compelling benefit or reason for doing it. In the case of current data, the reason to migrate is obvious: the data is needed in the new system to ensure continued support of business operations. But how much historic data should be converted, and how much of it will ever be used or even looked at in the new system? Consider the available options, and be aware of any external requirements that may influence your decision. The Sarbanes-Oxley Act, for example, may impact your decision to some degree, as may countless other state and federal regulations. For instance, some industries require a minimum of five years worth of human resource data be maintained, so this historic data cannot be ignored. Experienced system integrators may recommend different approaches to "maintaining historic data." Migrating it is not always necessary, and the cost/benefit of other approaches should be considered. Be aware that some modern enterprise applications were not architected to allow for simple conversion of data. Your efforts in this regard could be significantly more complicated.

One approach is to only migrate data that is considered current and necessary to support present and future business operations, while leaving the historical data in place. This means that the old system (or at least the database) needs to be maintained for some period of time. There is a cost attached to this, which should be weighed against the cost of migrating the historic data. With two systems hosting your current and historic data, reporting capabilities across two systems are likely to be required. This also carries a cost and should be taken into consideration. Over time, the historic data may no longer be required and can safely be archived.

Another approach was borne out of the limitations inherent in some modern applications. The recommended approach for loading data into a new system is usually to go through the supported application programming interfaces (APIs) or interface tables. Quite often these interfaces are limited in how they can be used and do not support the loading of historic data. If this approach is not appealing or is considered too expensive, a second approach is to create a custom database schema in the new system and migrate the historic data over to this custom schema rather than the new application schema. This way, the historic data and the current data are maintained in the same database but in two separate schemas. For simplicity, the custom schema can closely mirror the legacy application schema in order to avoid having to perform complex conversions of the historic data. With this approach, reporting capabilities that can report on data across two schemas will be required in order to obtain a unified view of current and historic data.

Productivity Tools

Two of the main reasons for data migration efforts failing and causing significant delays to projects are the lack of skilled resources and script spaghetti. These are exactly the problems that productivity tools and proven methodologies are able to address.

In order to move data from one system to another, there is a requirement for technical resources with a thorough knowledge of the data structures in both systems. In most cases, knowledge of the source system can be found in house, but companies have to bring in consultants for the target system expertise. Most consultants are experts in certain modules such as purchasing or inventory; if you are doing full-scale migration covering several modules, you will need a small army of internal and external resources to get the job done. Without productivity tools designed specifically for data migrations, this army usually sets off developing scripts.

Unfortunately, a large number of increasingly complex scripts can create a maintenance nightmare. Your initial set of scripts may be well documented, but as change requirements continue to swell and the migration cut-over date nears, developers resort to quick fixes - an extra update script here, an undocumented control script there. This makes it hard to keep track of the project, let alone identify the source of any inconsistent behavior during the conversion. Brute force is used to get the conversion to go the way you want - a recipe for script spaghetti. It is very hard to break this vicious cycle and almost impossible to reuse the scripts for later interfaces.

Beware of enterprise resource planning (ERP) vendors and system integrators claiming they have "productivity tools" that are nothing more than custom scripts developed for one of their former clients and add almost no value to the project. Hearing them out may be a good way to assess their level of experience and understanding of data migrations, but don't stop your evaluation until you have looked at some specialized productivity tools designed specifically for data migrations. These tools exist, and in some cases they are further specialized to address the unique migration requirements of a specific enterprise application. By definition, productivity tools should make less-skilled resources more productive. By extension this means you should look for tools that can make your own resources with less target system expertise just as productive as experienced consultants and make expert consultants even more effective.

When considering the adoption of specialized productivity tools in a migration project, four issues should be considered:

  1. The cost implications of using tools: Initial investment is required, but with the right tool, payback is usually achieved long before the end of the project.
  2. The time implications of using tools: Time is money, and by definition, a productivity tool will get the job done in a shorter time frame.
  3. The quality implications of using tools: When used correctly and innovatively, the use of productivity tools will result in higher quality data.
  4. Other benefits achieved by using tools: More end-user involvement, reusable components, documentation and audit trail.

The first three issues have already been discussed, so let's consider the fourth. Nobody knows your data better than the users who work with it every day. By involving them in the data migration process, you encourage ownership and can resolve problems in the source data more effectively. Unfortunately, when a data migration is done without productivity tools, it is difficult to share and communicate to nontechnical users as they don't care to look at scripts and complex logs. Productivity tools can present data pre- and post-transformation in a way that end users can understand and interpret.
A lot of work goes into developing data migration routines. The ability to reuse these in the future for subsequent conversions or for ongoing interfaces can contribute greatly to your ROI. A tool that supports some sort of customizable template paradigm will allow you to get multiple uses out of one piece of work. Good productivity tools are also able to generate documentation or come with already-documented migration templates that are specific to a packaged application. Having this functionality allows the inevitable breakup of the project team to have little effect. Future database administrators, programmers or end users can troubleshoot or make changes to the data and database long after the data migration has been completed.

Proven Methodology

The need for a proven methodology for data migration goes hand in hand with productivity tools. The best tools actually enforce a methodology, thereby enhancing their value to a migration project. A proven methodology is a step-by-step process that has been used successfully in a situation similar to yours. Asking questions about a system integrator's methodology for data migration is a good way to gauge his understanding of the complexities of the task at hand. Don't allow anyone to dismiss the fact that enterprise data conversions are complex processes that cannot be implemented successfully without a thorough understanding of the challenges involved. If there is nothing more than a bullet point on their project plan showing "data migration," then you are better off steering clear. Ask questions about reference projects where they successfully applied methodology X and used productivity tool Y to migrate data to system Z.

When faced with the daunting task of a data migration, your first order of business is to acknowledge that the project will be unlike any you have ever been part of. The data will be unique and, therefore, there is no possibility of precedent. Begin by planning and making strategic decisions about historic data and poor quality data disposition. A migration-specific productivity tool and a proven methodology combined with skilled resources is your best path to smooth migrations. While you are not likely going to be able to sit back, relax and watch the magic happen, you will at least go to bed at a reasonable hour every night knowing that you are well prepared for success.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access