Database archiving is becoming an important new topic for data managers. The need for this function has surfaced at most IT organizations, and the problems it addresses keep growing. These problems include challenges with data retention requirements, application renovations and e-discovery. Most IT data managers recognize the problems, but many do not see database archiving as a solution. This will change as the technology matures and spreads.

Database archiving is the practice of removing selected business records that are not expected to be referenced again from operational databases and storing them in a separate archive data store where they can be retrieved if needed. In essence, it partitions the application database into the operational database (current business records that are still of value to the business) and the archive data store (inactive business records that need to be retained but that have no expectation of being used again for business purposes). You don't archive databases; you archive business records from databases. Database archiving is an electronic form of records retention.

For example, a banking application has an operational database containing data for transactions (such as deposits and withdrawals). As data for a single transaction ages, it reaches a point where all intended or expected business uses of the information have been accomplished. The business record includes all data relevant to the transaction, including reference information pointed to from the transaction. For example, customer name and address may be copied from the customer master record in order to complete the business record that is moved to the archive.

The timing of when an instance of the business record type is ready to be moved is determined by a policy set by the archive designer. This policy may be simple (90 days after creation) or complex (one year after creation unless the account is flagged as under review or the account has a negative balance).
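
The two example policies can be written as simple predicates over a transaction. The following is a minimal sketch, not a reference to any archiving product: the `Transaction` class, its field names and the choice to treat "one year" as 365 days are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Transaction:
    # Hypothetical fields, chosen only to illustrate the two policies.
    created: date
    account_under_review: bool = False
    account_balance: float = 0.0


def simple_policy(txn: Transaction, today: date) -> bool:
    """Simple policy: eligible 90 days after creation."""
    return today - txn.created >= timedelta(days=90)


def complex_policy(txn: Transaction, today: date) -> bool:
    """Complex policy: eligible one year after creation, unless the
    account is under review or has a negative balance.

    Assumption: "one year" is approximated as 365 days.
    """
    old_enough = today - txn.created >= timedelta(days=365)
    blocked = txn.account_under_review or txn.account_balance < 0
    return old_enough and not blocked
```

A scheduled archive job would presumably apply one such predicate to each candidate business record and move only those for which it returns true.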

Figure 1 shows the phases in the life of a business record as they pertain to database archiving. A business record is created and remains in the operational state as long as it can be updated or can participate in creating or updating other data records. For example, the banking transaction would be operational from the time it is created until it updates the customer's master account record, creates a record for the financial system and possibly gets updated with a flag indicating it has been audited.

Data is in the reference state if it can no longer be changed, and can no longer create or change other data records, but is still expected to be used for read-only purposes. This includes report generation (detailed or summary), extract processing for business intelligence data stores and anticipated customer inquiries.

Data often reaches a state where all changes are final and all expected reference uses have been accomplished. This is the inactive state. The bank no longer needs this transaction. However, the bank may be required to keep the data available for many more years to satisfy government regulations, or the bank may choose to keep the data for a longer period for unanticipated uses. Generally, the time required by law for retention exceeds the time the bank would prefer to keep the data. Data in the inactive state usually does not need to be accessed through the application programs. Any unplanned uses can be served by simpler generic query and reporting tools. The data can safely be separated from the application and operational environments.

Some business record types become inactive and can be archived almost as soon as they are generated. Other types may never reach a point where they can exist independently from the operational environment. However, most applications have transaction data that can be safely moved to a database archive for 80 to 90 percent of its required retention period.
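
The three states described above reduce to two questions about a record: can it still change or drive changes to other records, and are any read-only uses still expected? The sketch below encodes that decision. The function and parameter names are hypothetical, and in practice the answers would come from application-specific rules rather than two ready-made flags.

```python
from enum import Enum


class LifecycleState(Enum):
    OPERATIONAL = "operational"
    REFERENCE = "reference"
    INACTIVE = "inactive"


def classify(can_change_or_drive_updates: bool,
             has_expected_read_uses: bool) -> LifecycleState:
    """Map the two questions onto the lifecycle states."""
    # Still updatable, or still able to create or update other records.
    if can_change_or_drive_updates:
        return LifecycleState.OPERATIONAL
    # Final, but reports, extracts or customer inquiries are still expected.
    if has_expected_read_uses:
        return LifecycleState.REFERENCE
    # Final, and every expected use has been accomplished.
    return LifecycleState.INACTIVE


def is_archive_candidate(state: LifecycleState) -> bool:
    """Only inactive records are candidates for the archive data store."""
    return state is LifecycleState.INACTIVE
```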
The data lifecycle and the ability to achieve application and system independence determine the suitability of the data for database archiving. If the data qualifies, then as much as 90 percent of the operational data can be offloaded from the operational systems and retained in an archive data store that is cheaper and more efficient for managing inactive data.

Database archiving is a subset of the larger topic of data archiving. Data archiving includes separate technologies for file archiving, document archiving, email archiving and database archiving. The purpose of archiving and the generic model of the archiving process are largely the same for all of them. However, the actual implementation of database archiving is far more complex than it is for the other forms of data archiving. It is also in the early stages of evolution, whereas other forms of data archiving have been around for a long time, and their best practices and tools have matured.
