
A Prescription for Replication – Just What the Doctor Ordered

  • Mona Hogue, Tony Stafford
  • January 01 2002, 1:00am EST

In today's global enterprise, the need for large-scale, near-real-time data replication is reminiscent of the old adage about the common cold: sometimes the cure is worse than the illness. Take antihistamines, and you feel wretched. Antibiotics build up a resistance to something more serious – and don't really do a thing for the bug that has you down. Steroids get you feeling better faster, but may cause heart failure. With its medicinal properties still under debate in the New England Journal of Medicine, chicken soup, basically, will make your mom happy.

However, the problems that compel enterprises to adopt effective replication strategies are not like the common cold at all. Left untreated, they will not simply go away. Gartner estimates that by 2002, storage replication technologies will be deployed in 70 percent of the Fortune 500 companies. Today's data replication market exceeds $700 million and will grow to about $1.3 billion by 2003.

Enterprises have no choice but to adopt effective replication strategies as most suffer from conflicting mandates. For example, businesses require high availability for critical applications but provide too narrow a backup window to protect vital information resources and keep backup data current.

Which Enterprises Will Benefit?

For those enterprises that have lost data due to storage-related problems, have an insufficient backup window to protect data while meeting user needs or want to reduce dependence on proprietary solutions, replication is a must-have process. Other enterprises that will benefit from replication include those that deploy more than 20 servers, run multiple operating systems and require high availability for applications and data. Large-scale distributed systems in which storage growth and its problems have become visible and for which total cost of ownership and service-level-agreement monitoring have become issues are also prime candidates, as are those that are experiencing high rates of change, consolidating data centers or engaging in mergers and acquisitions.

Host-Based Replication for Mainframe or Open Systems

Mainframe host-based replication or migration can be implemented with software that is independent of the storage vendor subsystem. It requires only that the source and target storage devices share the same architecture. This implementation is a hybrid between what is commonly called a mirror (described later) and what we know as replication. It is asynchronous, but it can be synchronized at specific times. It allows an application to be nondisruptively switched to the migrated volumes. It is implemented at a volume level; but because it is host-based, there are many facilities that allow coordination with host applications.

It is also used for replacing storage subsystems nondisruptively, creating point-in-time copies of data for backups, application testing and data center migrations/replications. Both the source and target storage devices must be accessible from the host, although the storage can be attached over fiber with channel extenders.

The benefit of mainframe host-based replication/migration is that it is independent of the storage subsystem, is completely nondisruptive and allows full use of all available storage. Most implementations of host-based replication require some storage allocated to logging and recovery.

Open systems host- or server-based replication can also be implemented as a software-only solution. Some component or agent must be installed on every server that participates in the replication process. These agents manage the data transfer and various integrity states and also control when and how the replicated copy can be made available.

Open systems server-based replication is completely "vendor independent," managing devices from multiple storage manufacturers. Replication can be controlled on the production server and usually can be easily integrated into production schedules.

It can simplify configuration and operation because the replication transparently layers on top of existing storage devices, providing the same view of the storage subsystem to the application regardless of physical storage implementation. This attribute can be important in the case of databases that span multiple volumes and that usually implement logs and recovery data sets on separate high-performance devices. With server-based replication, it is also possible to group volumes together to form a "logical group," which maintains write-order integrity. This means that the order of I/O across this group of volumes can be maintained at the remote site – a valuable tool for keeping data at the remote site consistent in the event of failure.
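Write-order integrity across a logical group can be pictured with a small sketch. It is illustrative only and assumes one way of doing it: a single sequence counter shared by every volume in the group, with the remote site applying updates strictly by sequence number. The names LogicalGroup and apply_in_order are hypothetical and do not come from any product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# One journal entry: (sequence number, volume name, block number, data).
Entry = Tuple[int, str, int, bytes]


@dataclass
class LogicalGroup:
    """Hypothetical logical group: all member volumes share one sequence counter."""
    volumes: List[str]
    next_seq: int = 0
    journal: List[Entry] = field(default_factory=list)

    def write(self, volume: str, block: int, data: bytes) -> int:
        """Record a write and return its group-wide sequence number."""
        if volume not in self.volumes:
            raise ValueError(f"{volume} is not in this logical group")
        seq = self.next_seq
        self.next_seq += 1
        self.journal.append((seq, volume, block, data))
        return seq


def apply_in_order(entries: List[Entry], remote: Dict[str, Dict[int, bytes]]) -> List[int]:
    """Apply entries at the remote site by sequence number, whatever order they arrived in."""
    applied = []
    for seq, volume, block, data in sorted(entries, key=lambda e: e[0]):
        remote.setdefault(volume, {})[block] = data
        applied.append(seq)
    return applied
```

In this sketch a database log record written before its data page always receives the lower sequence number, so the remote copy never holds the page without the log record that explains it.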

Open systems server-based replication is usually configured server-to-server rather than as dual I/O from a single server. Server-to-server is the most appropriate scheme for extended-distance disaster recovery operations or for data replicated for off-host processing.

Network-Based Replication

Relatively new, network-based replication is implemented in the storage area network (SAN) infrastructure and currently does not apply to the mainframe. In some ways, it is similar to controller-based replication because it is independent of the host operating system and hardware. Unlike server-based solutions, network-based replication requires no component installation on application servers.

Single server network- and controller-based replication both bypass the application server to write to the remote location storage device. In effect, network- and controller-based replication is "remote mirroring."

Network replication, however, can be implemented only in a SAN, and it has no knowledge of the application server. This makes it difficult to synchronize snapshots and point-in-time copies or to invoke and be invoked by server-based processes. Network replication works only at the volume level, oblivious to file systems or mount points. In this relatively new solution, features such as logical volume grouping – a requirement for remote asynchronous replication – have yet to mature. Likewise, all volumes usually must be controlled by a single network device to achieve any multivolume synchronization.

Controller-Based Replication

The most common and mature replication technology, controller-based replication, operates at the storage-controller level. This mirroring technology now has both remote and asynchronous extensions. The controller-to-controller configuration usually requires additional controller features and microcode, which can be costly and which require an identical setup at the remote location.

A key disadvantage is storage vendor "lock-in" at both local and remote locations. Usually enterprises are forced to acquire expensive, high-end storage controllers to implement replication. For those enterprises planning to use less demanding and inexpensive systems at remote sites while installing their high-performance storage subsystems locally, this could be an unnecessary expense.

Although most of the disadvantages of network-based solutions also apply to controller-based solutions, a key differentiator is that controller-based replication only handles volumes as the controller sees them, not as the application server sees them. In many operating systems commonly used throughout global enterprises, including Windows 2000 and Solaris, it is possible to create logical volumes from multiple physical volumes at the application-server level. These volumes, which aren't the same as controller volumes, can span controllers and even manufacturers. Great care must be taken to ensure that whole logical volumes are replicated.
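The check this calls for can be sketched in a few lines. The example assumes an administrator can list which physical volumes make up each logical volume and which physical volumes are being replicated; the function name and the "controller:volume" identifiers are hypothetical, chosen only for illustration.

```python
def partially_replicated(logical_volumes, replicated):
    """Return {logical volume: missing physical volumes} for volumes only partly replicated.

    A logical volume with no replicated members is treated as deliberately
    excluded from replication and is not reported.
    """
    problems = {}
    for name, members in logical_volumes.items():
        member_set = set(members)
        missing = sorted(member_set - set(replicated))
        if missing and len(missing) < len(member_set):
            problems[name] = missing
    return problems
```

A logical volume that spans two controllers, with only one controller's volume replicated, would be reported here with the missing member listed.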

In many controller implementations, a configuration change requires an engineer and cannot be made simply through a software console. This makes the implementation inflexible and expensive to change. As for system resources, some controller-based implementations require set-aside reserved business continuance volumes. If this is implemented on a redundant array of independent disks (RAID) device, which requires 20 percent reserved space, then considerable amounts of additional storage may be required.

Some network and controller replication implementations offer a snapshot facility, which is used primarily as a stepping stone to tape backups. The snapshot does not create a full copy of the data, but provides for access to a virtual point-in-time copy of original data and logged updates of all prior images, blocks and tracks.
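One common way to provide such a virtual copy is copy-on-write, sketched below. This is an assumption for illustration, not a description of any particular vendor's snapshot facility, and the class SnapshotVolume is hypothetical. Only the prior image of each block changed after the snapshot is saved, so no full copy is ever made.

```python
class SnapshotVolume:
    """Hypothetical volume with a single copy-on-write snapshot."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # live data: block number -> contents
        self.saved = None           # prior images of blocks changed since the snapshot

    def take_snapshot(self):
        """Start a snapshot. Nothing is copied at this point."""
        self.saved = {}

    def write(self, block, data):
        """Write live data, saving the prior image the first time a block changes."""
        if self.saved is not None and block not in self.saved:
            self.saved[block] = self.blocks.get(block)
        self.blocks[block] = data

    def read_snapshot(self, block):
        """Read the block as it was when the snapshot was taken."""
        if self.saved is None:
            raise RuntimeError("no snapshot taken")
        if block in self.saved:
            return self.saved[block]
        return self.blocks.get(block)
```

A tape backup could read every block through read_snapshot while the application keeps writing to the live volume.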

Replication Versus Mirroring

It is easy to confuse mirroring and replication because the two technologies are merging in various implementations.

Typically mirroring is synchronous. Both copies must be written before the I/O is completed back to the application. The application server views both the source and the mirrored target volumes as local.
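The synchronous rule can be shown with a minimal sketch. Volume and mirrored_write are hypothetical names, and the failure handling is simplified: a real mirror would normally break the mirror or retry rather than just raise an error.

```python
class Volume:
    """Hypothetical in-memory stand-in for a storage volume."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.online = True

    def write(self, block, data):
        if not self.online:
            raise OSError(f"{self.name} is offline")
        self.blocks[block] = data


def mirrored_write(source, target, block, data):
    """Report I/O completion only after both copies have been written."""
    source.write(block, data)
    target.write(block, data)  # an error here means completion is never reported
    return True
```

The application's write therefore takes as long as the slower of the two volumes, which is why mirrors are usually kept local.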

Replication, on the other hand, is generally asynchronous, with source and target linked by a wide-area TCP/IP network. The greatest complication is ensuring that the remote data is always in a state consistent with, rather than identical to, the local data. This means you have to preserve write order over the TCP/IP link and resynchronize after the connection is lost.
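A rough sketch of these two requirements follows, assuming a simple first-in, first-out queue of pending writes. The class AsyncReplicator is hypothetical, and the remote site is modeled as a dictionary rather than a real TCP/IP connection.

```python
from collections import deque


class AsyncReplicator:
    """Hypothetical asynchronous replicator with an ordered pending queue."""

    def __init__(self):
        self.local = {}
        self.remote = {}
        self.pending = deque()  # FIFO queue preserves write order
        self.link_up = True

    def write(self, block, data):
        """Complete the I/O after the local write alone; queue the update for the remote."""
        self.local[block] = data
        self.pending.append((block, data))

    def drain(self):
        """Ship queued writes in their original order while the link is up.

        Returns the number of writes shipped. Calling this after the link
        returns is the resynchronization step.
        """
        shipped = 0
        while self.link_up and self.pending:
            block, data = self.pending.popleft()
            self.remote[block] = data
            shipped += 1
        return shipped
```

While the link is down the remote copy falls behind but stays consistent, because it only ever reflects a prefix of the writes in the order they were issued.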

Some replication implementations, however, can be configured so that both source and target are viewed as local and written synchronously, which effectively creates a mirror using replication technology. This may add functionality for recovering from breaks, resynchronizing and creating point-in-time copies, which makes replication superior to mirroring. Another advantage of replication over mirroring is that you can delay updates to the replicated data set behind the primary volumes. This helps avoid propagating bad updates, providing time to create point-in-time copies prior to corruptions.
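The delayed-update idea can be sketched as follows. The class DelayedReplica, its delay parameter and the use of plain numbers as timestamps are assumptions made for illustration.

```python
class DelayedReplica:
    """Hypothetical replica that applies each update only after a fixed delay."""

    def __init__(self, delay):
        self.delay = delay
        self.primary = {}
        self.replica = {}
        self.queue = []  # (timestamp, block, data), kept in write order

    def write(self, now, block, data):
        """Update the primary immediately and queue the update for the replica."""
        self.primary[block] = data
        self.queue.append((now, block, data))

    def advance(self, now):
        """Apply, in order, every queued update that is at least `delay` old."""
        while self.queue and now - self.queue[0][0] >= self.delay:
            _, block, data = self.queue.pop(0)
            self.replica[block] = data

    def point_in_time_copy(self):
        """Copy the replica as it stands, before newer updates reach it."""
        return dict(self.replica)
```

If a bad update hits the primary at time 50 and the delay is 60, the replica at time 70 still holds the earlier good data, and a point-in-time copy can be taken before the bad update arrives.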

The Future of Replication

In terms of the disaster recovery market, replication ranks third behind RAID implementations and mirroring technology. It offers a cost-effective solution for very remote (more than 50-100 km) disaster recovery sites. There will always be a need to fail over to systems beyond the capability of fiber (i.e., over 50 km).

Off-host processing will continue to offer more flexibility than other techniques. It is more efficient and, more importantly, offers additional options for creating application-consistent copies of data volumes.

As SANs become more common, replication will migrate into the storage-vendor-independent SAN – primarily from vendor-dependent controller-based replication. Host-based replication will maintain its importance for off-host processing but must evolve more to increase the ability to integrate with application server processes.

Replication is not optional. The 24x7 global economy demands that enterprises have high-performance data availability solutions that can be easily and cost-effectively implemented and scaled to duplicate critical data using existing resources. Such solutions must be compatible with existing procedures and staff skills. Both standard recovery and disaster recovery operations must perform. The solutions also should be available on multiple platforms to avoid the risk of proprietary vendor lock-in. Regardless of where the data is stored, the solution must demonstrate dynamic flexibility and scalability of automated replication, all from a single point of control.

Doing something is better than doing nothing. The challenge will be selecting the right cure for your enterprise.
