- How am I going to determine the performance criteria?
- What are the risks and how am I going to manage those?
- Do I have a failure mode effect analysis document which captures the failure modes and does a cause and effect analysis?
There are many other questions that arise. Let us look at the different strategies and checkpoints which are an integral part of a data migration project.
While implementing a data migration architecture, the data migration team has to take a number of considerations into account. The following are a few of them:
- Data volume analysis
- Source system and target system processing power
- Complexity of data mapping rules and business rules
Figure 1: Point-to-Point Data Migration Architecture
If, during transformation, several records are normalized into separate database records, resulting in a significant increase in the overall data volume, extract data from the source system(s) as-is, move it to a staging area in the target system, and then apply cleansing and transformations locally. This approach offers:
- Reduced network round trips
- Once data reaches the staging area, the network-bound portion of the migration is complete; transformations run locally on the target server
- Leverage processing power of target server
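The staged approach above can be sketched as a two-step process. This is a minimal illustration, not a real ETL implementation; the record fields and the example of normalizing one denormalized row into separate order and line-item records are hypothetical:

```python
# Sketch of the staged approach: copy source rows as-is into a staging
# area on the target server, then cleanse and transform locally there.

def extract_to_staging(source_rows, staging):
    """Step 1: move raw rows across the network once, untransformed."""
    staging.extend(dict(row) for row in source_rows)  # as-is copy

def transform_locally(staging):
    """Step 2: cleanse/normalize on the target side, e.g. split a
    denormalized order row into separate order and line-item records."""
    orders, line_items = [], []
    for row in staging:
        orders.append({"order_id": row["order_id"],
                       "customer": row["customer"].strip()})
        for item in row["items"]:
            line_items.append({"order_id": row["order_id"], "sku": item})
    return orders, line_items

source = [{"order_id": 1, "customer": " Acme ", "items": ["A1", "B2"]}]
stage = []
extract_to_staging(source, stage)
orders, items = transform_locally(stage)
```

Note that the normalization in step 2 doubles the record count, which is exactly why it is cheaper to perform after the single network transfer rather than before it.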
Figure 2: Hub-Spoke Data Migration Architecture
Figure 2 shows source data being partitioned and historic data preconverted in the source environment. This architecture supports any number of source and target systems (spokes) while managing the overall ETL processes through a central hub.
- Can accommodate any number of sources and/or targets
- Data rules are kept at a separate layer
- Load balanced on target server
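The hub-spoke idea can be sketched as follows. This is an assumed, simplified model — the `Hub` class, rule functions, and target names are invented for illustration, not drawn from any particular ETL tool:

```python
# Minimal sketch of a hub-spoke arrangement: the hub keeps the data
# rules in one shared layer and routes records between any number of
# source and target spokes.

class Hub:
    def __init__(self):
        self.rules = []     # shared transformation rules, kept at the hub
        self.targets = {}   # target spokes keyed by name

    def add_rule(self, fn):
        self.rules.append(fn)

    def register_target(self, name, sink):
        self.targets[name] = sink

    def route(self, record, target_name):
        for rule in self.rules:          # rules applied once, centrally
            record = rule(record)
        self.targets[target_name].append(record)

hub = Hub()
hub.add_rule(lambda r: {**r, "name": r["name"].title()})

crm, billing = [], []                    # two target spokes
hub.register_target("crm", crm)
hub.register_target("billing", billing)

hub.route({"name": "jane doe"}, "crm")
hub.route({"name": "john roe"}, "billing")
```

Because every record passes through the same rule layer, adding a new source or target is a registration step rather than a new point-to-point pipeline.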
Making Sense of Data Migration
The biggest challenge with a data migration project is: Making the target system understand what the source system is telling it.
Here are a few key practices and issues with data migration projects.
Comprehensive Mapping. Every data field that is going to be migrated from the source system to the target system must be defined and examined to ensure compliance with field lengths, data types, domain values permitted, system rules, integrity checks and any other possible issues.
A detailed data map is critical to understanding where information is going as well as whether there are any known or avoidable obstacles in the way of successfully arriving there.
A good data map will detail an in-depth cross-referencing of all mutual fields across the source system and the target system. Ideally it should include:
- Names of applicable to and from fields
- Lengths and data types of these fields
- Any logic involved in mapping such as string truncations or validations against any business rules
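One way to capture the three items above in a machine-checkable form is a small mapping structure per field pair. The field names, lengths, and types below are hypothetical examples, not from a real schema:

```python
# A data map entry records the to/from field names, lengths and types,
# plus any mapping logic such as truncation or normalization.

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class FieldMapping:
    source_field: str
    target_field: str
    source_type: str
    target_type: str
    target_length: int
    transform: Optional[Callable[[str], str]] = None

    def apply(self, value):
        if self.transform:
            value = self.transform(value)
        return value[: self.target_length]   # enforce target field length

data_map = [
    FieldMapping("CUST_NM", "customer_name", "CHAR(60)", "VARCHAR(40)", 40,
                 transform=str.strip),
    FieldMapping("ST_CD", "state_code", "CHAR(2)", "CHAR(2)", 2,
                 transform=str.upper),
]

row = {"CUST_NM": "  Acme Industrial Supply  ", "ST_CD": "ny"}
out = {m.target_field: m.apply(row[m.source_field]) for m in data_map}
```

A map in this form doubles as documentation and as the driver for the transformation code, so the cross-reference and the implementation cannot drift apart.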
Extract Validation. Data in the source system is known to contain problems, or can be unknowingly incorrect, due to many possible factors, including human data-entry errors and/or a lack of checks and accountability, particularly in less sophisticated systems. Any validation rules that can be used to locate and fix these problems should be run on the first-pass data extract, extending the process to multiple iterations if required.
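A first-pass validation might apply a set of rules to the extract and separate clean records from failures; in practice the cycle is rerun after fixes, since correcting one error can expose another. The rules and records here are invented for illustration:

```python
# Run each validation rule against each extracted record and partition
# the extract into clean records and records needing correction.

def validate(records, rules):
    """Return (clean, errors); each entry is (record, failed_rule_names)."""
    clean, errors = [], []
    for rec in records:
        failed = [name for name, check in rules if not check(rec)]
        (errors if failed else clean).append((rec, failed))
    return clean, errors

rules = [
    ("zip_present", lambda r: bool(r.get("zip"))),
    ("zip_numeric", lambda r: str(r.get("zip", "")).isdigit()),
]

extract = [{"zip": "10001"}, {"zip": ""}, {"zip": "ABC12"}]
clean, errors = validate(extract, rules)
```

Keeping the failed-rule names with each rejected record gives the correction team a worklist for the next iteration of the extract.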
It is common for some errors not to surface until others have been identified and fixed. Whereas the source system may ignore discrepancies such as the same person's billing address being recorded differently in different files or database tables, the target system, potentially having better business rules, may opt for a Type 1 or Type 2 slowly changing dimension implementation for the same address change. Data validation and cleanup are an essential component of a good migration plan.
Quality Transformation. Data extracted from the source system needs to be transformed or translated into a format that the target system can import and understand. This transformation will not only apply the defined data mappings, but will also execute any underlying business logic functions that may be essential to populating more complex data structures.
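Beyond field-by-field mapping, a transformation step may need to derive values the target schema requires but the source never stored. The sketch below is a hypothetical example; the source field names and the delinquency rule are invented:

```python
# A transformation combining direct mappings, type conversion, and a
# business rule that derives a target-only status field.

def transform(source_rec):
    rec = {
        "account_id": source_rec["ACCT_NO"],   # direct mapping
        "balance": float(source_rec["BAL"]),   # type conversion
    }
    # Business rule: the target marks an account delinquent when the
    # balance is negative and the source flags the account inactive.
    rec["status"] = ("delinquent"
                     if rec["balance"] < 0 and source_rec["ACTIVE"] == "N"
                     else "current")
    return rec

out = transform({"ACCT_NO": "A-100", "BAL": "-25.50", "ACTIVE": "N"})
```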
Fortunately, these stages can be efficiently performed by technically advanced ETL tools such as Informatica, Ab Initio, DataStage, etc.
So far we have discussed the technical aspects of data migration. However, as with many IT projects, the what, where, and when is just as important as the how. When dealing with the management aspect of a data migration project, the following issues should be considered.
Phased or Big Bang Approach?
When choosing to migrate data from one system to another, does it make sense to try to accomplish this all at once or move data over through a controlled phase of multiple releases?
Naturally there are pros and cons to both options; which approach best fits your organization needs to be evaluated against a variety of factors.
Some examples of these factors can be as straightforward as how much data there is to migrate or as seemingly abstract as the amount of training effort it will take to make a "big bang" worthwhile to your organization in terms of the ROI.
How long will the migration take? How many internal resources must the client's IS team commit to the migration, and for how long? What is the impact on other business-critical processes? What is the cost?