
Data Migration - A Divine Comedy?

  • September 01 2007, 1:00am EDT

If what you read on the Web is to be believed, then all data migration projects are akin to Dante Alighieri's descent into Hell - "Abandon all hope, you who enter." White paper after white paper quotes the Standish Group as saying that more than 80 percent of data migrations either fail entirely or overrun their planned budgets and timelines. Others claim that data migration projects can't get experienced staff because no one wants to do a second one. Never in the history of IT has a given class of project been so maligned. Do data migration projects really deserve this reputation, and if so, what can you do to beat the odds?

Rocket Science or a Walk in the Park?

In considering this outpouring of doom and gloom, it's worth noting that many of the sites quoting these statistics are trying to sell you something. That goes some way toward explaining how often the figures are quoted, but it doesn't automatically follow that we can dismiss the statistics themselves. The bottom line is that data migration projects do have a disproportionately high failure rate, and the reason is that they are hard.

On the face of it, data migration does look easy. After all, isn't it just a matter of copying customer names from system A to system B? As the saying goes, it's not exactly rocket science. Of course, there may be a few little issues to solve along the way - what if the name field in system A is longer than is allowed in system B? What if special characters in system A aren't recognized by system B? What if customer data is coming from multiple source systems and needs to be cross-matched before being loaded to system B? What if there are multiple versions of the same customer in system A that need to be identified and merged into a single customer record before being loaded into system B? What if data model differences mean that one customer in system A translates to many customers in system B? Okay, so there may be a few challenges along the way, but surely even an average analyst can solve all these problems. What is it that makes these projects hard?

Don't Underestimate the Complexity of a Migration Project

It is an inherent characteristic of data migration projects that they involve two or more systems. As a result, analysts and developers need in-depth knowledge of at least two systems. If both of these are complex enterprise resource planning (ERP)-type systems, then that's hard. If one or more of the systems has been outsourced to a third party, that makes it even more difficult - particularly if the endpoint of the migration project is to decommission a source system. It's hard to get people to cooperate fully and freely if it means they'll be out of a job. This is complicated by the fact that the target system is usually new to the organization and nobody really knows what it's supposed to do or how it's supposed to do it.

More importantly, from a management perspective, two or more systems also mean twice as much politics and often conflicting agendas; that means more risk. These are all things that good project managers can handle as long as they go into the project with open eyes and with the full support of the executive team. This is not a project for inexperienced managers to muddle through in isolation.

Lesson number 1: Use an experienced project manager with a sound track record for delivering projects and ensure that this project is backed up by executive-level support.

Change is Inevitable

One common complication in data migration projects is that either the target is evolving during the course of the project or the team's understanding of the data it requires is evolving. This problem is common even when the target's data requirements are formally documented. It is compounded by the fact that the source system may also be undergoing changes during the course of the project. The world of business rarely stands still. There are new products to deal with, new tax/accounting/compliance laws and regulations to meet and new production problems that lead to unexpected and even erroneous data appearing in the source system. As a result, the data migration project team is effectively trying to build a bridge from a moving target to a moving target.

To some extent, these problems can be managed through sound change management processes and through getting agreement and commitment to clear interface contracts. But the fact remains that many of these changes are not optional, and the data migration team simply has to cope with them as they arise.

A key strategy for managing this issue is to break up the migration project into the smallest chunks possible. This is a good IT management practice that is too often ignored on data migration projects, usually because the true size of the project is initially underestimated. It's a simple fact that there is less exposure to change in a short time period than in a long one. Plus, each migration subproject only needs to deal with change in its own area of focus.

Added benefits from breaking the project up into manageable areas include:

  • Visible progress boosts team morale;
  • Later subprojects benefit from lessons learned in earlier deliveries; and
  • Performance issues can be addressed. Instead of having to migrate all current data plus all history in a limited window, historical data could be migrated first, leaving only current data to be processed during the limited window.
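
As a rough illustration of that last point, the split between older and current records might be sketched as follows. This is a minimal sketch under stated assumptions: the `posted_on` field, the cutoff date and the helper name `split_by_cutoff` are all hypothetical, and a real migration would pick a cutoff that matches its own business rules.

```python
from datetime import date

def split_by_cutoff(rows, cutoff, date_field="posted_on"):
    """Separate rows dated before the cutoff from rows on or after it.

    Assumes every row carries a comparable date in `date_field`
    (a hypothetical field name used only for this illustration).
    """
    historical = [r for r in rows if r[date_field] < cutoff]
    current = [r for r in rows if r[date_field] >= cutoff]
    return historical, current

# Made-up sample rows, not real data.
rows = [
    {"id": 1, "posted_on": date(2005, 3, 14)},
    {"id": 2, "posted_on": date(2006, 11, 2)},
    {"id": 3, "posted_on": date(2007, 8, 30)},
]

# Older rows can be loaded ahead of time; only `current` needs the cutover window.
historical, current = split_by_cutoff(rows, cutoff=date(2007, 1, 1))
```

Anything in the older set that is updated after its early load would still need to be picked up again at cutover, which this sketch does not attempt to handle.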

Lesson number 2: Find a way to break large migration projects into smaller ones.

What's in the Source?

There is generally a significant gap between the data a source system is supposed to contain and the data it actually contains. It's natural for people to focus on the data they know (or think) is there rather than looking for gaps and exceptions. But the gaps and exceptions cause project overruns.

There are two issues to be considered:

  • What is the actual quality of source data versus what is required for the migration (and who's going to fill in the gap); and
  • Do we really understand what's under the hood in the source system well enough to map it confidently to a target?

In both cases, data profiling tools can help you understand the size of the problem. These tools are useful for highlighting to both analysts and managers the size and number of data quality issues. But it's important to recognize that the tools don't fix the problems.

The first step in addressing data quality problems is to assess the impact of the problem and the number of records affected. This is critical information for the second step - setting priorities. It's important to establish a formal process for prioritizing and fixing data quality issues. Too often, this work is left until too late in the project, and instead of being managed properly, issues are dumped on unsuspecting businesspeople, making it somebody else's problem. Bear in mind that there may not be a business case for addressing low-impact problems, so the data migration team may need to deal with them in some other way (for example, by using a default value - ideally one that can be searched on in case later remedial work is required).
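
To make the counting and defaulting steps concrete, here is a minimal sketch of how a team might tally the records affected by each issue and apply a searchable default. It is an illustration only, not a substitute for a profiling tool: the field names, the 10-character name limit and the `UNKNOWN-MIGRATED` sentinel are all assumptions invented for the example.

```python
from collections import Counter

# Hypothetical sentinel: a default that is easy to search for later.
DEFAULT_POSTCODE = "UNKNOWN-MIGRATED"

def profile(records, checks):
    """Count how many records fail each named check."""
    counts = Counter()
    for record in records:
        for name, is_bad in checks.items():
            if is_bad(record):
                counts[name] += 1
    return counts

def apply_default(records, field, default):
    """Return copies of the records with blank values in `field` replaced."""
    return [
        {**r, field: default} if not (r.get(field) or "").strip() else dict(r)
        for r in records
    ]

# Assumed rules: postcode is mandatory and the target allows 10-character names.
checks = {
    "missing_postcode": lambda r: not (r.get("postcode") or "").strip(),
    "name_too_long": lambda r: len(r.get("name") or "") > 10,
}

# Made-up sample records, not real data.
source = [
    {"name": "Ann Lee", "postcode": "2000"},
    {"name": "Bartholomew Smythe", "postcode": ""},
    {"name": "Cy Orr", "postcode": None},
]

issue_counts = profile(source, checks)  # input for prioritization
cleaned = apply_default(source, "postcode", DEFAULT_POSTCODE)
```

The counts feed the prioritization discussion with the business, while the sentinel keeps defaulted records easy to find if remedial work is approved later.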

Lesson number 3: Find, prioritize and address data quality problems.

It's a Throwaway - Right?

In theory, system A is being migrated to system B once, then system B goes live and system A is decommissioned - so there's no point in fussing over the quality of the migration code, right? This is not only a naïve view, it's just plain wrong.

First of all, it is always worth writing good quality code. If you've got people on your team who genuinely believe that it is quicker to produce and fully test poor quality code than good quality code, you need to look seriously at the makeup of your team.

Secondly, migration code is never run just once. A company's data is a valuable and often irreplaceable asset. It is vital that the data migration process is tested, validated and fully reconciled before the final run. Usually this happens many, many times before all issues are resolved and all discrepancies are explained to the satisfaction of the business owners.
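
One way to picture a reconciliation pass is the sketch below, which compares source and target rows by key and lists every discrepancy. It is an assumption-laden illustration rather than a prescribed method: the `id` key, the compared fields and the function names are hypothetical, and it assumes keys are unique within each system.

```python
import hashlib

def row_fingerprint(row, fields):
    """Hash the selected fields of a row in a fixed order."""
    joined = "\x1f".join(str(row.get(f, "")) for f in fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def reconcile(source_rows, target_rows, key, fields):
    """Compare two row sets by key and report every discrepancy.

    Assumes `key` is unique in each row set; duplicate keys would
    silently collapse to the last row seen.
    """
    src = {r[key]: row_fingerprint(r, fields) for r in source_rows}
    tgt = {r[key]: row_fingerprint(r, fields) for r in target_rows}
    return {
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "unexpected_in_target": sorted(tgt.keys() - src.keys()),
        "mismatched": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }

# Made-up sample rows, not real data.
source_rows = [
    {"id": 1, "name": "Ann Lee", "balance": "10.00"},
    {"id": 2, "name": "Cy Orr", "balance": "25.50"},
    {"id": 3, "name": "Di Wu", "balance": "0.00"},
]
target_rows = [
    {"id": 1, "name": "Ann Lee", "balance": "10.00"},
    {"id": 2, "name": "Cy Orr", "balance": "25.05"},
    {"id": 4, "name": "Ed Poe", "balance": "5.00"},
]

report = reconcile(source_rows, target_rows, key="id", fields=["name", "balance"])
```

Because a check like this is rerun after every trial migration, it is itself part of the migration code that has to be written and tested to a good standard.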

Lesson number 4: Good quality code is vital to the success of the migration.

Data Migration is an IT Project

Because data migration projects don't in themselves add new business functions, many companies leave it to the IT department to handle alone. Unfortunately, this approach ignores the fact that the business owns the data and the business needs to live with any unintended consequences of the migration. When migration projects go off the rails, the business bears both the financial and nonfinancial burden.

Active involvement by key business users throughout the project is vital to:

  • Provide valuable insight into the meaning and use of the data that is being migrated;
  • Prioritize and fix data quality issues;
  • Minimize/delay changes to source and target systems; and
  • Validate the success of the migration.

Lesson number 5: The business and IT must work together to ensure that the migration aligns with the business needs.

The Bottom Line

In Dante's Divine Comedy, the exhortation above the gates of Hell to "Abandon all hope," in this instance, proved unwarranted. The Roman poet Virgil leads Dante safely through the nine circles of Hell and the story ends happily. Dante had an excellent leader to guide him on his harrowing journey, and they both went through the gates well aware of the enormity and the risks of what they were attempting. It could even be noted that they clearly had project buy-in from the executive level, but that would probably be taking the analogy a bit too far.

Surprisingly, there are some parallels between this epic poem and modern-day data migration projects. The three books of the Divine Comedy serve remarkably well in classifying data migration projects, which usually fall neatly into one of the following categories:

  • Inferno - Unmitigated disasters;
  • Purgatorio - Disasters that eventually end (late and over budget, but mission accomplished); or
  • Paradiso - Well-run projects that deliver on time and on budget.

Fortunately, we can learn from the experience of others and choose our path. With the right approach and the right team, data migration projects can be brought in on time, on budget and with no casualties.
