What to Do About Data Debris

Every day we create 2.5 quintillion bytes of data, and the pace of information creation continues to accelerate. While the potential benefits from mining big data are extraordinary, the reality for most enterprise CIOs is that current information growth is overwhelming the tools and processes they have in place to collect, store, analyze, process, archive, find and delete it.

To make matters worse, most enterprises operate in multiple regions of the world. Whether they sell to global markets, rely on global supply chains or aggregate data from around the world, enterprises operate in a world where regulations, such as EU-focused privacy rights, are complex, continually evolving and even contradictory — since the rules are not consistent across numerous countries and jurisdictions.

Given the overwhelming difficulty of these challenges, it has never been more important for those in records and information management (RIM) to collaborate with CIOs who are investing significant resources in IT infrastructure and solutions. But given today’s economic realities, most CIOs remain under pressure to reduce IT spend as a percentage of revenue. This pressure, however, only exacerbates the data management problem, because the uncontrolled accumulation of data actually increases legal and compliance risk.

Fortunately, there is a solution. Forward-thinking CIOs have discovered that it is possible to control information growth and reduce costs at the same time, by developing an information lifecycle governance program that takes a transparent, enterprise-wide approach to data retention and enables the routine and defensible disposal of valueless data.

The Challenge of Data Debris

At the 2012 Compliance, Governance and Oversight Council (CGOC) Summit (see sidebar below), a survey of corporate CIOs and general counsel revealed that typically 1 percent of corporate information is on litigation hold, 5 percent is in a records retention category and 25 percent has current business value. This means that approximately 69 percent of the data that most organizations keep has no legal, regulatory, privacy, security or business value.

Why would organizations keep so much data debris? The legal, records, privacy and security departments and business users know the value of the information they create and store. But only IT has the ability to permanently delete information, and because IT has no insight into that value, it is hesitant to delete anything. In fact, many legal departments indirectly rely on this “save everything” approach to prevent the deletion of information that may be subject to a legal hold in the event of litigation or government investigation. Unfortunately, this strategy fails to recognize that retaining unnecessary information actually increases e-discovery costs, because there is more to review and produce. Ironically, it also increases risk, since it often leads to the production of information that would otherwise not be a part of litigation.

Retention, Privacy and Security

CIOs saddled with uncontrolled information growth also face the dilemma that some information must be deleted to comply with privacy and security regulations.

Of the more than 100,000 international laws and regulations that are potentially relevant to Forbes Global 1000 companies, many require that certain information be deleted. Even more challenging, these laws are continually evolving, and they often vary or even contradict each other across borders and jurisdictions.

The EU’s upcoming privacy reforms, for example, must be implemented across 27 member countries, introducing significant complexities for companies doing business in these countries. This complexity recently tripped up Facebook, which was found guilty in March of 2012 of violating Germany’s privacy laws relating to the ownership of uploaded data and the use of that information without consent and disclosure. This case has severe implications for any company with mobile apps or social media sites that collect contact information or original content from users. The stakes are high: the proposed EU Data Protection Regulation would fine corporate violators up to 2 percent of their annual worldwide revenue.

Reflecting the challenge of complying with privacy laws, the Information Governance Reference Model (see Figure 1 below), a project within the Electronic Discovery Reference Model, recently added Privacy and Security as primary stakeholders in the effective governance of information.

Shortcomings of the Traditional Retention Schedule

Whether the problem is over-retention or under-retention, most IT departments fall short on their compliance efforts because IT has no insight into the regulatory value of information. This situation exists because the mechanism for determining regulatory value, the records retention schedule, is severely outdated. Most retention programs today:

  • Remain narrowly focused on a small set of traditional information. They ignore the bulk of today’s digital information — encompassing marketing data, emails, social/mobile media, and significant amounts of log and metadata — which increasingly falls under the privacy laws.
  • Ignore other obligations, such as privacy and security.
  • Don’t account for the business value of information.
  • Can’t survive legal scrutiny because the retention schedule isn’t automatically updated to reflect evolving regulations.

Compounding the problem, most enterprises lack automated processes for communicating and implementing the retention schedule. This further impedes IT’s efforts to determine which information should be retained and which should be deleted. It also makes it impossible for IT to eliminate unnecessary data to drive down costs.

Achieving Defensible Disposal with a Modern Retention Program

The defensible disposal of data debris can have a dramatic impact on information economics. Reducing the amount of information an organization must store means that IT can significantly reduce its spend on storage, servers and backup. Defensible disposal can also reduce e-discovery costs, streamline regulatory responses and minimize the risk of penalties associated with over- and under-retention.

But to achieve defensible disposal, the retention schedule must be modernized so that it:

  • Establishes a single, transparent regulatory framework and global taxonomy for the entire business, including what information is covered, who is obliged to comply, how retention and disposition are triggered and which privacy obligations need to be observed.
  • Allows for jurisdictional differences by enabling knowledge to flow from local data stewards back to those managing the program centrally.
  • Applies to all information: paper and electronic, records and non-records, structured and unstructured.
  • Reflects the legal, regulatory and business value of information.
  • Is continually updated in real time to reflect business, legal and technological changes.
  • Is automatically and effectively communicated to all information stakeholders and to IT.
  • Is directly linked to the actual systems and repositories that have custody of the information, so that valuable information is automatically retained and valueless information is automatically deleted.
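
To make the idea of an executable retention schedule more concrete, the following is a minimal sketch in Python of how a schedule might drive a retain-or-dispose decision. Everything in it is an illustrative assumption rather than a description of any particular product or of the CGOC's model: the class names, the fields, the "retain" and "dispose" labels and the order in which the checks are applied are all hypothetical. It assumes that a legal hold always overrides disposal, that a retention rule is matched by category and jurisdiction, and that anything with no hold, no unexpired rule and no business value is treated as data debris.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class RetentionRule:
    """One entry in a hypothetical retention schedule."""
    category: str      # information category, e.g. "tax-record"
    jurisdiction: str  # where the rule applies, e.g. "US" or "DE"
    retain_days: int   # how long to keep the item after its trigger date


@dataclass(frozen=True)
class InformationItem:
    """A piece of information held in some repository (illustrative fields)."""
    item_id: str
    category: str
    jurisdiction: str
    trigger_date: date  # event that starts the retention clock
    on_legal_hold: bool = False
    has_business_value: bool = False


def find_rule(item: InformationItem,
              schedule: List[RetentionRule]) -> Optional[RetentionRule]:
    """Return the first rule matching the item's category and jurisdiction."""
    for rule in schedule:
        if (rule.category == item.category
                and rule.jurisdiction == item.jurisdiction):
            return rule
    return None


def disposition(item: InformationItem,
                schedule: List[RetentionRule],
                today: date) -> str:
    """Decide whether an item should be retained or disposed of.

    Assumed precedence: legal hold, then unexpired retention rule,
    then business value. Anything left over is treated as debris.
    """
    if item.on_legal_hold:
        return "retain"
    rule = find_rule(item, schedule)
    if rule is not None:
        expires = item.trigger_date + timedelta(days=rule.retain_days)
        if today < expires:
            return "retain"
    if item.has_business_value:
        return "retain"
    return "dispose"
```

In this sketch, a schedule containing RetentionRule("tax-record", "US", 7 * 365) would keep a US tax record until roughly seven years after its trigger date, while a log file with no matching rule, no hold and no business value would be flagged for disposal. A real program would also need to encode privacy obligations that require deletion, which this example leaves out.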

Getting There

Creating a modern, transparent and executable retention schedule requires a unified information lifecycle governance program that enables legal, records, privacy, security and business stakeholders to collaborate with IT on information management. Developing this collaborative approach within a large enterprise requires the right team. An Executive Committee should include the Global Director of RIM, the CIO, CFO, General Counsel and other officers. A Senior Advisory Group composed of line-of-business leaders must ensure business responsiveness. A Program Office should drive and measure progress toward goals and direct the efforts of a Working Group that develops and implements the relevant processes.

For organizations to reap the full benefits of big data without overwhelming the budget or increasing legal and regulatory risks, RIM must work with CIOs to help determine how to defensibly dispose of the nearly 70 percent of data that has no corporate value. To do this, a modern, transparent and executable retention program must be created to provide IT with the knowledge and tools to automatically and precisely identify which information must be retained, which information must be deleted and which information loses its value over time and can eventually be discarded.

The only way to create this modern retention schedule is for all information stakeholders to work together to align their needs. For more information on how information governance leaders can successfully build the case for change, the CGOC offers two excellent publications, “Elements of the Modern, Executable Retention Schedule” and “Information Lifecycle Governance Leader Reference Guide,” along with many other resources.
