Shaped by scandals and forged by legitimate demand, today's business environment dictates that organizations maintain accountability and justify executive decisions. IT departments share the responsibility with company executives to ensure corporate success and freedom from litigation and potential jail time. Key technologies that maintain and manage the critical aspects of information come in the form of information lifecycle management (ILM).
Briefly, ILM is the optimum allocation of storage resources that support a business. Every element of information in an organization has a useful lifespan. That life may be short, as with a voice conversation, or very long, as with certain legal and medical records, which may need to be kept for as long as a human lifetime. ILM applies rigor to the often chaotic and unstructured data stores that an organization maintains. Storing, using, maintaining and destroying this data can be quite expensive over its lifetime and, what is worse, its lifetime is often much longer than its useful life. The art of ILM is to understand an organization's information needs and to develop the infrastructure and processes required to keep the information useful, while at the same time creating the discipline to minimize the cost of that maintenance. The choices and strategies available today to companies considering an ILM implementation include tiered storage, archiving and policy-based data destruction.
Tiered storage is at the heart of an ILM implementation. The value of ILM is the ability to tie the cost of storage to the value of the information on it. The most important data, or the most performance-critical data, should be placed on the highest performance and most expensive storage. Likewise, information that is accessed infrequently, either due to its age or other factors, can be stored on lower-performing and lower-cost storage platforms. In this way, you can correlate the cost of storage to the value received.
Thus, we see the beginnings of an ILM implementation strategy coming in the form of a two-tier architecture: high-performance disk for mission-critical data and low-cost disk for aging or noncritical data. Most organizations will add a third tier to complete the implementation, the archive tier. Studies have shown that information is accessed most frequently shortly after its creation, and access drops off sharply soon after, often within days. Within weeks or months, a given piece of information has almost no chance of being accessed or modified, but it must be retained for a much longer time due to regulatory standards, policies or other constraints. By analyzing your usage patterns, you can determine where the point of last access is likely to occur and move all data older than this to the archive tier.
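The three-tier placement described above can be sketched in a few lines of Python. This is a minimal illustration, not a product feature: the function names, the 30-day low-cost threshold and the percentile approach to finding the "point of last access" are all assumptions made for the example, and real thresholds would come from your own access logs.

```python
from datetime import date

# Hypothetical threshold in days; derive the real value from your own access logs.
LOW_COST_AFTER_DAYS = 30


def archive_cutoff_days(access_ages_days, coverage_pct=99):
    """Smallest observed age (in days) that covers coverage_pct percent of accesses.

    access_ages_days holds, for each logged access, how old the item was
    when it was read or modified.
    """
    if not access_ages_days:
        raise ValueError("need at least one observed access")
    ages = sorted(access_ages_days)
    # Integer ceiling division avoids floating-point rounding surprises.
    needed = (len(ages) * coverage_pct + 99) // 100
    return ages[max(needed, 1) - 1]


def choose_tier(last_access, today, mission_critical, archive_after_days):
    """Pick a tier: archive for aged data, fast disk for fresh critical data."""
    idle_days = (today - last_access).days
    if idle_days >= archive_after_days:
        return "archive"
    if mission_critical and idle_days < LOW_COST_AFTER_DAYS:
        return "high-performance"
    # Aging or noncritical data goes to the low-cost disk tier.
    return "low-cost"


# Illustrative access ages only, not measured data.
sample_ages = [1, 1, 2, 3, 5, 8, 13, 21, 34, 400]
cutoff = archive_cutoff_days(sample_ages, coverage_pct=90)
print(cutoff, choose_tier(date(2024, 5, 30), date(2024, 6, 1), True, cutoff))
```

Choosing a coverage percentage below 100 keeps a single very late access from pushing the archive cutoff out by months; how much late access you can tolerate from the archive tier is a business decision rather than a technical one.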
Archive technology has more value in an organization than just a low cost-per-megabyte hardware platform. Most archive platforms offer a data reduction mechanism that allows for the elimination of redundant copies of data, thereby reducing storage demand. In addition, data in an archive tier is typically backed up only once, either through usual backup means or via data replication to a redundant archive. The cost savings from not copying this same aged data to tape backup on the standard weekly rotation can be tremendous. If you perform full backups weekly and store them offsite for one year, you remove 52 terabytes from your tape infrastructure and storage vaulting costs for every terabyte moved to the archive. In addition to the cost benefits of an archive tier, many organizations benefit from the legal and regulatory compliance improvements inherent in a comprehensive archive strategy.
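The 52-terabyte figure follows directly from the backup schedule: each terabyte left in the weekly full-backup rotation is copied 52 times a year, and every copy is held for the full retention period. A small sketch of that arithmetic follows; the function name and parameters are hypothetical, and it assumes no tape compression or deduplication and ignores incremental backups.

```python
def tape_tb_avoided(archived_tb, fulls_per_year=52, retention_years=1.0):
    """Tape capacity no longer consumed once data leaves the full-backup rotation.

    Assumes every full backup copies the data in full and that each copy
    is vaulted for the whole retention period.
    """
    copies_retained = fulls_per_year * retention_years
    return archived_tb * copies_retained


# One terabyte archived, weekly fulls, one-year offsite retention.
print(tape_tb_avoided(1))  # 52.0
```

The same function shows how quickly the savings scale: with a longer retention period or more archived data, the avoided tape volume grows proportionally.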
Data retention is legislated for many types of information, such as financial transactions and health care information. The ability to retrieve this information from tape, however, becomes more difficult with each passing year. Tape drive technology is continually improving, and the newer drives often become incompatible with older media after just a few generations - often after five years or less. What's more, backup software frequently changes and becomes obsolete, and software vendors are acquired or go out of business. The challenges of retrieving data under these circumstances can be insurmountable, exposing the organization to unacceptable risk.
These problems are largely eliminated when data is archived in an online archive tier. Because the information is online, unsupported or obsolete media are no longer an obstacle to retrieval. Software is available to read the data files as long as the source application is in use, and software revision planning can account for the archived data in the same fashion as the primary production files.
Also, data stored in backup tapes does not easily lend itself to legally binding demands for retrieval. Imagine the litany of pressing questions: "When was that email sent? Which backup did it go onto, and which tapes comprise the backup set?" If email is archived online, however, there is no need to return multiple backup tapes from an off-site vault, restore each in turn and look for the data in question. Online tools can be employed to search the archive automatically, supplying the required information in minutes or even seconds, rather than in hours or days, dramatically lowering both discovery costs and the amount of requisite human intervention. In addition, policy-based archiving can ensure the obsolescence and destruction of data that has passed its useful life, eliminating the need for discovery and retrieval altogether.
Data destruction is another key element of the information lifecycle - don't forget that it's not over 'til it's over. That is, the information lifecycle doesn't end until you can guarantee that the data under your control has reached the end of its life and been destroyed. Keep in mind that any data destruction procedures must be defined by company policy and be in force before you are legally required to locate and retrieve specific information. Once you are bound by legal discovery, destroying any relevant data is illegal and can lead to court sanctions or even criminal penalties.
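A policy-based destruction check can make the ordering explicit: a record is eligible for destruction only when its retention period has expired and no legal hold applies. The sketch below is illustrative only; the Record fields and the may_destroy function are hypothetical names, and actual retention periods must come from your own policies and legal counsel.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    record_id: str
    created: date
    retention_days: int       # set by company policy, not by this code
    legal_hold: bool = False  # set when the record is subject to discovery


def may_destroy(record, today):
    """Return True only if retention has expired and no legal hold applies."""
    if record.legal_hold:
        # A hold overrides the retention schedule, however old the record is.
        return False
    return (today - record.created).days >= record.retention_days


old_email = Record("msg-001", date(2015, 1, 1), retention_days=365)
held_email = Record("msg-002", date(2015, 1, 1), retention_days=365, legal_hold=True)
print(may_destroy(old_email, date(2024, 1, 1)), may_destroy(held_email, date(2024, 1, 1)))
```

Checking the hold before the retention date keeps the rule simple to audit: no change to the retention schedule can accidentally release a record that is under discovery.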
The Growing Need for ILM
Although the need for an effective ILM strategy has intensified recently, definitions of ILM go back many years. In 2001, the International Organization for Standardization (ISO) published a standard on records management, ISO 15489:2001. In 2004, the Storage Networking Industry Association (SNIA) proposed a new definition of ILM: Information lifecycle management is comprised of the policies, processes, practices and tools used to align the business value of information with the most appropriate and cost-effective IT infrastructure from the time information is conceived through its final disposition. Information is aligned with business processes through management policies and service levels associated with applications, metadata, information and data.
Today the demand for ILM products and services continues to grow. Forecast data growth rates of 30 to 50 percent are common, and some organizations have documented rates in excess of 100 percent annually for the past several years. Government and industry regulatory bodies are active in their push for improved accountability. Identity theft and other scams will drive additional data storage and regulatory requirements, which will require more storage and improved abilities to accurately find and retrieve old data. The trick is not to break the bank doing so.
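Growth rates like these compound, which is why they strain budgets so quickly. As a rough sketch (the function name and the 100-terabyte starting point are assumptions for illustration, and real growth is rarely this steady), capacity after a number of years can be projected as follows:

```python
def projected_capacity_tb(current_tb, annual_growth, years):
    """Compound growth: capacity after `years` at a steady annual growth rate.

    annual_growth is a fraction, so 0.5 means 50 percent per year.
    """
    return current_tb * (1 + annual_growth) ** years


# Hypothetical 100 TB environment projected three years out.
for rate in (0.3, 0.5, 1.0):
    print(f"{rate:.0%} growth: {projected_capacity_tb(100, rate, 3):.1f} TB")
```

At 100 percent annual growth the environment doubles every year, so three years turns 100 terabytes into 800.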
Large enterprises are spending millions of dollars per year on their storage environment. With even small organizations struggling to manage multiple terabytes of information, it is no wonder that software providers have come to the rescue. The ability to manage petabytes of storage composed of multiple tiers of disk, the need for long-term storage and retrieval, and the ability to satisfy government and corporate regulators all require sophisticated software solutions that can meet these disparate demands. The cost of these software solutions is significant. Plus, integrating them with the enterprise is challenging and requires skills that are beyond "business as usual" for most IT professionals. Like most technology, ILM infrastructure and methodology are hard to manage, and the specialists required to execute a coherent ILM strategy are expensive and hard to come by. In addition, these are not people you need to keep on staff over the long term: once the transformation is complete and your administrators are trained, the specialists become redundant.
What's more, the emergence of electronic means of storing and transmitting information, for documents as well as communications, creates havoc for IT personnel who must satisfy the requirements for legal discovery. The e-discovery amendments to the U.S. Federal Rules of Civil Procedure, which took effect December 1, 2006, are the modern mechanism for this long-standing practice and mean that anything stored electronically could at some point be requested in discovery and used as evidence in court. Email, file servers, databases and laptops - none of these are exempt from legal discovery requirements. When lawyers serve a subpoena or discovery request, IT is required to locate and produce every electronic communication and document that the court requires. Many companies currently employ large staffs just to satisfy this one requirement, and the ability to quickly and efficiently manage this process is becoming ever more important.
Remember that ILM means information lifecycle management, with all three words representing key components of this important process. The proper tools and techniques are merely steps along the way toward developing expertise in new methods of managing information. The long-term health of your data, and of your company, requires your active participation.