Over the past few years, as virtualization has begun to get more buzz in the industry, many people have asked me what business benefits can be derived from virtualizing their existing storage infrastructure, what problems it solves and what the cost justifications are for taking the plunge into a storage virtualization solution. I will begin by looking at the differences between a traditional data center model and a virtualized data center model.
In the traditional model, storage is separated into discrete elements of management. Operations staff must rely on the integrated provisioning and management interfaces provided by the storage array vendors and are essentially limited to what those vendors offer. All data services (provisioning, backup, recovery, replication, etc.) are typically provided by the storage array, on the host itself or through a third-party application such as backup software. The management of discrete elements is divided into the network, applications, storage services, and the individual storage devices, as illustrated in Figure 1.
In the virtualized data center model, an abstraction layer is applied between the compute elements (your server vendors) and the storage elements (your storage vendors). Once this abstraction layer is in place, you can classify storage resources and form them into pools based on performance, reliability or cost. The management is consolidated into clouds based on network, compute, data services and storage pools, as illustrated in Figure 2.
Provisioning can now be applied from a pool of storage based on the service level requirements of the application, and data can be moved among the pools based on the age of the data. The virtualization solution can also provide all the other critical data services such as backup, replication, data migration and recovery. By moving the storage into pools based on cost and specifically moving the data to a less costly storage pool as the data ages, a true information lifecycle management (ILM) paradigm can be achieved. I call this intelligent abstraction layer the data services engine.
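The pooling idea can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's interface: the `Pool` and `DataServicesEngine` classes, the numeric tier ranking and the pool names are all assumptions made for the example. It shows a request being satisfied from the least expensive pool that still meets the application's service level.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    """A class of storage grouped behind the abstraction layer (hypothetical model)."""
    name: str
    tier: int          # assumed ranking: 1 = fastest and most expensive
    capacity_gb: int
    used_gb: int = 0

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - self.used_gb


@dataclass
class DataServicesEngine:
    pools: list = field(default_factory=list)

    def provision(self, size_gb: int, required_tier: int) -> Pool:
        """Carve a volume from the cheapest pool that still meets the service level."""
        candidates = [p for p in self.pools
                      if p.tier <= required_tier and p.free_gb >= size_gb]
        if not candidates:
            raise RuntimeError("no pool satisfies the request")
        # Among pools that are good enough, the highest tier number is the cheapest.
        pool = max(candidates, key=lambda p: p.tier)
        pool.used_gb += size_gb
        return pool


engine = DataServicesEngine([
    Pool("fc-monolithic", tier=1, capacity_gb=1000),
    Pool("sata-modular", tier=2, capacity_gb=4000),
])
db_pool = engine.provision(200, required_tier=1)       # needs top-tier storage
archive_pool = engine.provision(500, required_tier=2)  # any tier up to 2 will do
```

The host asking for storage never names an array; it only states a service level, which is the point of the abstraction layer.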
Top 10 Reasons to Use Storage Virtualization
Storage virtualization can solve myriad issues. The top 10 issues, in order of importance, are as follows.
1. Simplifying provisioning. This is especially true for larger organizations that have implemented storage solutions from multiple vendors or for companies that use a tiered storage model. The beautiful thing about a virtual storage infrastructure is that all provisioning can be accomplished the same way across heterogeneous storage arrays from different vendors. The ability to use a single pane of glass for provisioning reduces the learning curve for operations staff, which increases operational efficiency and enables new applications to come online faster. The virtualization solution can provide other advantages such as thin provisioning (which simplifies planning for storage growth) or capacity on demand (which enables better storage utilization with just-in-time provisioning).
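Thin provisioning is easier to picture with a small model. The sketch below is a simplified illustration under stated assumptions: the `ThinVolume` class is hypothetical, each volume tracks its own count of free physical blocks (a real pool would be shared among many volumes), and a block holds an arbitrary byte string. The host sees the full virtual size, but physical blocks are consumed only on the first write to each block.

```python
class ThinVolume:
    """Volume that advertises a large size but allocates blocks only on first write."""

    def __init__(self, virtual_blocks: int, pool_free_blocks: int):
        self.virtual_blocks = virtual_blocks      # size the host sees
        self.pool_free_blocks = pool_free_blocks  # physical blocks still available
        self.block_map = {}                       # virtual block number -> data

    def _check(self, block: int) -> None:
        if not 0 <= block < self.virtual_blocks:
            raise IndexError("block outside the virtual volume")

    def write(self, block: int, data: bytes) -> None:
        self._check(block)
        if block not in self.block_map:
            if self.pool_free_blocks == 0:
                raise RuntimeError("pool exhausted: add physical capacity")
            self.pool_free_blocks -= 1            # allocate on first write only
        self.block_map[block] = data

    def read(self, block: int) -> bytes:
        self._check(block)
        # Blocks that were never written read back as a zero byte in this model.
        return self.block_map.get(block, b"\x00")

    @property
    def allocated_blocks(self) -> int:
        return len(self.block_map)


vol = ThinVolume(virtual_blocks=1_000_000, pool_free_blocks=100)
vol.write(5, b"a")
vol.write(5, b"b")        # overwrite: no new physical block is consumed
vol.write(999_999, b"c")
```

After these three writes the volume still presents one million blocks to the host while only two physical blocks are in use, which is why growth can be handled by adding capacity to the pool later.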
2. Data mobility. Simplifying data migrations and technology refresh is beginning to take on greater importance as more and more data is being stored for longer periods. When a storage array finally runs out of room or performance, or comes off maintenance, migrating data between existing and new hardware can be a difficult and time-consuming process. Data movement may also be disruptive to the running application servers. When large arrays storing multiple terabytes are in use, many servers can be affected, causing downtime. With a virtual abstraction layer in place, all data movement between the storage pools is transparent to the application hosting the data. Migrations can be done at any time without the need to bring down critical applications, which reduces downtime and makes for happy end users.
3. Fixing backup problems. This is where the data services engine can really shine. Most companies are still using traditional tape backup methods for data protection and recovery. Tape has always been cheaper than disk and because the media is removable, it can be shipped and stored for disaster recovery or data archiving.
This is all good, but it does not fix the critical issues around more stringent recovery objectives. The recovery point objective (RPO) capabilities of tape backup cannot hold a candle to newer technologies such as continuous data protection (CDP). Recovery from tape can also take many hours longer than recovery from disk, which makes a tight recovery time objective (RTO) hard to meet. Virtualization addresses this by leveraging virtual tape libraries (VTL) to make disk drives look like tape drives to the existing backup software.
VTL can be integrated into the virtualization abstraction layer and presented as an archive pool. So even when data is archived, it can be recovered at disk speeds. VTL can also be integrated with CDP technology to make things even simpler. CDP can store data as points in time from the application's perspective in native format (also called an active target) for immediate recovery. The point in time can also be presented to the existing backup software for backup at rapid speeds directly to the VTL over the storage area network (SAN). This is known as serverless backup. Integrating these two newer technologies through virtualization provides data protection as a service to the application. Imagine a world where backup just happens all the time, and if any viruses or corruption happen, the administrator can immediately roll back to a point in time just before the problem occurred.
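The roll-back idea behind CDP can be illustrated with a write journal. This is a minimal sketch, assuming every write is journaled with a timestamp and that a volume image is simply a mapping from block number to data; the `CdpJournal` class and its method names are hypothetical and do not describe any particular product.

```python
class CdpJournal:
    """Time-ordered journal of writes that can rebuild the volume at any instant."""

    def __init__(self):
        self._entries = []  # (timestamp, block, data), appended in time order

    def record_write(self, timestamp: float, block: int, data: bytes) -> None:
        if self._entries and timestamp < self._entries[-1][0]:
            raise ValueError("writes must be journaled in time order")
        self._entries.append((timestamp, block, data))

    def image_at(self, point_in_time: float) -> dict:
        """Replay the journal up to and including the requested point in time."""
        image = {}
        for timestamp, block, data in self._entries:
            if timestamp > point_in_time:
                break                 # everything after this is newer than requested
            image[block] = data       # later writes to a block replace earlier ones
        return image


journal = CdpJournal()
journal.record_write(1.0, 0, b"good")
journal.record_write(2.0, 1, b"good")
journal.record_write(3.0, 0, b"corrupt")   # the problem happens here

restored = journal.image_at(2.5)           # roll back to just before the problem
```

Because every write is kept, the recovery point is whatever instant the administrator chooses rather than the time of the last nightly backup.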
4. Fixing performance problems. In a typical SAN setup, each host is equipped with at least two host bus adapters (HBA) to provide path failover, in case something happens to a path, and load balancing to distribute input/output (I/O) across both paths to increase performance. This is demonstrated in Figure 3.
When performance is still not enough, or when the server administrator needs a simpler and more granular way to manage disk I/O resources, host-based volume managers can be used to abstract the underlying disks. When more logical unit numbers (LUNs) are added to increase storage space, they can be added to a virtual volume to also help increase performance, as demonstrated in Figure 4.
Although host-based volume managers are helpful in managing disk resources, leveraging virtualization at the fabric level adds the advantage of using all the ports of the storage array to access a single LUN and the ability to stripe disk access across all the ports of many storage arrays within a pool of storage. The abstraction layer can dramatically increase I/O performance by striping all I/O across many more available paths, which not only increases performance, but can also increase reliability. Figure 5 demonstrates this concept.
Each LUN in the storage array can be assigned to every front-end port, and the virtualization layer can provide virtual volume management at the fabric level to increase performance.
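Striping at the fabric level comes down to a simple address calculation. The sketch below shows one common round-robin layout; the function name, the stripe size and the path count are assumptions chosen for illustration, and a real virtualization layer may use a different layout.

```python
def stripe_location(logical_block: int, stripe_blocks: int, path_count: int):
    """Return (path index, block offset on that path) for a logical block."""
    stripe_number, offset_in_stripe = divmod(logical_block, stripe_blocks)
    path = stripe_number % path_count          # stripes rotate across the paths
    # Each path holds every path_count-th stripe, packed back to back.
    block_on_path = (stripe_number // path_count) * stripe_blocks + offset_in_stripe
    return path, block_on_path


# Spread 24 logical blocks over 3 paths in stripes of 4 blocks.
blocks_per_path = [0, 0, 0]
for block in range(24):
    path, _ = stripe_location(block, stripe_blocks=4, path_count=3)
    blocks_per_path[path] += 1
```

With this layout a large sequential transfer touches every path in turn, so each port and array controller carries an equal share of the load.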
5. Ability to pool storage by class. There are different types of storage arrays available on the market, and each type is usually priced by inherent data services capabilities, performance characteristics and reliability. A massive monolithic Fibre Channel disk array can provide many more capabilities than a simple modular serial advanced technology attachment (SATA) array, but usually at a significant price differential. Many companies have both types of storage in-house (called tiered storage), but they want a simple way to be able to move data between these different arrays as data ages in an ILM process. A virtual abstraction layer simplifies the data movement between unlike arrays, even those from different vendors, and makes the movement transparent to the hosts. By pooling storage into different classes based on cost, companies can realistically begin to implement ILM.
6. Ability to pool data by class. This benefit goes hand in hand with the ability to pool storage by class for creating policies that automatically migrate data to a lower cost pool of storage based on its age and importance to the organization. Data classification and policy creation are required to gain benefits from a virtual storage infrastructure that is tied into the needs of the business. This capability is sometimes referred to as service-oriented architecture (SOA).
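An age-based migration policy of this kind can be expressed as a small lookup table. The thresholds and pool names below are hypothetical, and the sketch considers only the age of the data; a real policy would also weigh its importance to the business.

```python
from datetime import date

# Hypothetical policy: (minimum age in days, target pool), oldest threshold first.
POLICY = [(365, "archive"), (90, "sata"), (0, "fibre-channel")]


def target_pool(last_modified: date, today: date, policy=POLICY) -> str:
    """Pick the pool for a data set based on how long ago it was last modified."""
    age_days = (today - last_modified).days
    for min_age, pool in policy:
        if age_days >= min_age:
            return pool
    return policy[-1][1]   # dates in the future fall back to the top tier


today = date(2024, 1, 1)
recent = target_pool(date(2023, 12, 1), today)   # 31 days old
aging = target_pool(date(2023, 6, 1), today)     # 214 days old
old = target_pool(date(2022, 1, 1), today)       # 730 days old
```

A scheduler in the virtualization layer could evaluate such a table periodically and move any data set whose target pool differs from its current one, without the host noticing the move.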
7. Reducing costs for disaster recovery. Disaster recovery is the killer app for virtualization. Imagine not having to mirror your physical infrastructure at a disaster recovery location. By virtualizing the compute layer, you can reduce the physical server footprint required at your disaster recovery location. By also virtualizing the storage, organizations can optimize storage utilization, reducing the physical storage footprint and thus creating a greener data center. Because the storage is virtual, it can be provisioned over any protocol. If an expensive SAN is used at the production location, all that is needed is a network to provide iSCSI disk access over Ethernet at the disaster recovery site.
Because the storage is virtual, it is possible to use any storage array at the disaster recovery location. You can replicate from expensive to inexpensive storage or from massive monolithic arrays to modular SATA or SAS arrays. Providing the replication intelligence in the virtualization layer also negates the need for array-based licensing of replication firmware, which means you can buy storage in bulk at a much lower cost from your existing storage vendor.
8. Improving reliability. The most reliable storage arrays on the market today are sold or rebranded by the usual providers: Hitachi Data Systems, IBM, Sun, EMC, HP, etc. These big iron boxes have redundancy built in so they rarely go down, but they can also come with big iron prices that many companies find hard to swallow. One of the reasons big boxes are expensive is the extra capabilities provided in the firmware, such as replication, business continuance volume creation, mirroring, etc., in order to ensure a reliable infrastructure.
Many companies find they can save money and get the same reliability as the big iron boxes by simply adding virtualization on top of more cost-effective modular storage arrays, and by either mirroring or striping the data across arrays. Figure 6 illustrates the use of virtualization to improve reliability.
By mirroring the data across lower cost arrays, you gain the benefits of not only doubling the uptime service level agreement by having multiple physical frames to store the data, but also the linear performance benefits of adding more controllers and disk spindles for spreading out the I/O load.
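The uptime benefit of mirroring can be estimated with a short calculation. This is a back-of-the-envelope sketch that assumes the arrays fail independently of one another and ignores the availability of the virtualization layer itself; the 99.9 percent figure is an illustrative input, not a measurement from any product.

```python
HOURS_PER_YEAR = 8760


def mirrored_availability(single_availability: float, copies: int = 2) -> float:
    """Availability of a mirror set that is up as long as any one copy is up."""
    # Assumption: failures are independent, so the chance that all copies are
    # down at once is the product of the individual unavailabilities.
    return 1 - (1 - single_availability) ** copies


def downtime_hours_per_year(availability: float) -> float:
    return (1 - availability) * HOURS_PER_YEAR


single = 0.999                                # one modular array (assumed figure)
mirrored = mirrored_availability(single)      # two such arrays mirrored
single_downtime = downtime_hours_per_year(single)
mirrored_downtime = downtime_hours_per_year(mirrored)
```

Under these assumptions a single array at 99.9 percent is expected to be unavailable for roughly 8.76 hours a year, while the mirrored pair drops to well under a minute. Shared failure modes such as power, fabric or site outages would reduce that gain in practice.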
9. Keeping your storage vendor honest. This one is easy. If you remove the requirements for having intelligence in the storage array by placing it at the fabric level in the virtualization layer, you are no longer locked in with a single storage vendor for critical data services. If you are no longer tied to a single vendor for your storage purchases, you can ask for the best price and buy storage as a commodity. If they try to enforce traditional pricing models, you can simply move on to another storage vendor.
10. Saving money. This benefit is a natural result of employing all of the previous benefits.
Although cost benefits (and therefore cost justifications) for moving to storage virtualization can be derived from the items outlined above, the overall value is greatly increased when the benefits are applied to larger environments where complexity tends to be greater. Although small shops will see cost savings both operationally and in capital costs, it's typically the larger shops that reap the greatest benefits from storage virtualization.