The advent of technology forever changed the management of information. Information was formerly tracked by files and boxes; with the move to digital, its size is now referenced in bytes. This transition is not a mere change in vernacular, but rather represents growth once unimaginable in a brick-and-mortar setting.

In a 2011 study on big data, the McKinsey Global Institute reported that nearly all sectors in the U.S. economy averaged 200 terabytes of stored data per company (for companies with more than 1,000 employees), with many sectors exceeding one petabyte of stored data.

To put it into context, “Megabytes, Gigabytes, Terabytes…What Are They?” defines terabytes and petabytes as follows:

Terabyte: A terabyte is approximately one trillion bytes, or 1,000 gigabytes. A terabyte could hold about 300 hours of good quality video or 1,000 copies of the Encyclopedia Britannica.

Petabyte: A petabyte could hold approximately 20 million 4-door filing cabinets full of text or hold 500 billion pages of standard printed text. It would take about 500 million floppy disks to store the same amount of data.
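To make those units concrete, here is a minimal sketch that converts the McKinsey figures into bytes. It assumes decimal units (one terabyte is 10^12 bytes, and one petabyte is 1,000 terabytes); the names UNIT_BYTES and to_bytes are illustrative and do not come from any of the cited sources.

```python
# Decimal (SI) storage units, matching the "one trillion bytes" definition above.
# Assumption: 1 PB = 1,000 TB = 10**15 bytes.
UNIT_BYTES = {
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
}


def to_bytes(amount: int, unit: str) -> int:
    """Convert a whole number of GB, TB, or PB into bytes."""
    return amount * UNIT_BYTES[unit]


average_company = to_bytes(200, "TB")  # the per-company average reported by McKinsey
one_petabyte = to_bytes(1, "PB")

# How many average-sized companies' data would fill a single petabyte?
print(one_petabyte // average_company)
```

Under these assumptions, a single petabyte equals five times the 200-terabyte average.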

All predictions point to the continued exponential growth of information as technologies evolve and the network connectivity of devices further enables the collection and exchange of data. Rather than managing the growth, businesses often rely on the ubiquity of storage options and decreased costs to influence decisions in favor of retention.

However, accessibility comes hand-in-hand with external threats: hackers with no geographical limitations attack networks to gain access and mine for information. Accordingly, the ability to store data should not be a go-ahead for its unmitigated accumulation, because the financial impact can be significant and the risks severe, as evidenced by the select examples below.

Lost productivity

Simply put, data is useless unless you can find it, and retention of unnecessary data is significantly increasing the proverbial haystack to sift through.

A recent survey conducted on automating information governance by AIIM, a global community for information professionals, confirms, “up to 80% of electronically stored information is ROT (redundant, outdated, or trivial).” Compounding the issues arising from data volume is the nature of stored data and the systems and applications in which the data is stored, all of which frustrate the search and retrieval process.

As IDC reports in its white paper The Knowledge Quotient: Unlocking the Hidden Value of Information Using Search and Content Analytics, 90% of all digital information is unstructured content locked in a variety of formats, locations, and applications and is made up of separate repositories that do not communicate with one another. The result is that it becomes more difficult to track down specific information:

• 61% Regularly access four or more systems

• 15% Access eleven or more systems

• 36% of day spent looking for and consolidating information

• 44% of the time cannot find information

The financial impact from lost productivity, taken as a whole, can be considerable.

As explained by IDC, assuming an average workweek of 41.8 hours with an annual salary of $80,000, the cost of time wasted searching for, but not finding, information is $5,700 a year per knowledge worker. Extrapolating from those figures, a business employing 1,000 knowledge workers wastes roughly $5.7 million annually searching for, but not finding, information.
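The extrapolation is simple multiplication, and a short sketch makes it easy to rerun with a different head count. The $5,700 per-worker figure is IDC's, as quoted above; the function name and the sample head counts are illustrative assumptions.

```python
# Annual cost of time spent searching for, but not finding, information.
# COST_PER_WORKER is the IDC figure quoted above; everything else is illustrative.
COST_PER_WORKER = 5_700  # US dollars per knowledge worker per year


def annual_search_waste(knowledge_workers: int,
                        cost_per_worker: int = COST_PER_WORKER) -> int:
    """Return the estimated yearly cost, in dollars, across all knowledge workers."""
    if knowledge_workers < 0:
        raise ValueError("knowledge_workers must be non-negative")
    return knowledge_workers * cost_per_worker


# Hypothetical head counts, including the 1,000-worker example from the text.
for head_count in (100, 1_000, 5_000):
    print(f"{head_count:>5} workers: ${annual_search_waste(head_count):,}")
```

Passing a different cost_per_worker lets a business substitute its own salary and workweek assumptions for IDC's.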

But over-retaining data not only impacts productivity, it has litigation implications as well.

Discovery pains

For a presentation to the Committee on Rules of Practice and Procedure Judicial Conference of the United States, large organizations (Fortune 200 companies) concerned about the impact of litigation costs on their ability to compete in a global economy surveyed their members. They requested detailed information on long-term litigation cost trends, U.S. and non-U.S., legal transaction costs, and legal fees and discovery costs in “major” closed cases (defined as cases with litigation costs greater than $250,000).

The results of the survey disclosed that the ratio of pages discovered to pages entered as exhibits in litigation is as high as 1000/1. This should not come as a surprise with the retention of redundant, obsolete, and trivial information, as discussed above.

Responding to electronic discovery requests has become an expensive endeavor as businesses store more electronic information. The largest part of those costs (73%) is related to review for relevance and privilege, as reported by the RAND Corporation in its study of litigation expenditures for producing electronic discovery.

Accordingly, it is not uncommon for parties in litigation to move for a protective order to either deny the right to certain information or shift the costs of production to the requesting party. While this determination is case-specific, courts have granted protective orders where the party seeking protection proves that the information rests in a format that is not “reasonably accessible.”

However, with the passage of time and ubiquity of electronically stored information in litigation, business decisions (or lack thereof) leading to retention of information in an inaccessible format may now factor into the reasonably accessible analysis. The United States District Court for the District of Nevada recently addressed this in United States ex rel. Guardiola v. Renown Health.

In that case, the issue before the court was whether Renown Health should be required to produce certain emails archived with a third party vendor. In its analysis, the court recognized that electronically stored information (ESI) “is now a common part and cost of business” and the retention practices of the business, as well as decisions on how to store data, have implications that need to be carefully considered in light of the “risk of litigation and corresponding discovery obligations.”

In other words, a party should not be relieved of its duty to produce documents merely because it has chosen a means of preserving evidence that makes ultimate production of relevant documents expensive. To permit a party to reap the benefits of a chosen technology, while at the same time using that technology as a shield in litigation, would lead to a result that is both incongruous and unfair. Based on the foregoing, the court ordered Renown Health to produce the subject emails.

As demonstrated above, maintaining and managing your information has ramifications from both a business and legal cost perspective, but keeping all that information also places a burden on the business to adequately protect it.

Protect what you have

Data breaches are becoming more common and more devastating. Noted author and expert on cybersecurity and cyber warfare, Peter W. Singer, states, “Ninety-seven percent of Fortune 500 companies have been hacked, and likely the other 3% have too, they just do not know it.”

That means a cyber-attack at your company is no longer a question of “if” but “when,” and failing to control the unmitigated growth of your information will contribute to the severity of the breach.

Take, for example, Sony Pictures, which suffered a data breach of its corporate network in late November of 2014. Per the class action complaint filed with the court, a hundred terabytes of data was stolen with at least 25 gigabytes of that total containing sensitive data on tens of thousands of Sony employees.

According to the various reports that came out at that time, lax security procedures, poor system monitoring, and failure to implement proper retention policies and practices contributed to the compromise of data that should have been better protected or, in some instances, disposed of because it had passed the end of its lifecycle.

The ripple effects from this breach include the aforementioned class action suit, the resignation of the co-chairman of Sony Pictures, negotiation battles with talent, and unknown costs associated with the release of confidential scripts and ideas. But aside from the consequences listed above, data breaches also have hard monetary costs.

In the case of Sony, several data security and retention issues were identified in various reports:

• CEO regularly reminded in unsecured emails of his own passwords for his and his family’s mail, banking, travel, and shopping accounts

• Lax security practices, such as pasting passwords into emails, using easy-to-guess passwords, and failing to encrypt sensitive data

• Storing passwords in folders titled “password”

• Retention of virtually everything in email systems

• Significant and repeated outages due to a lack of hardware capacity, running out of disk space, software patches that impacted the stability of the environment, poor system monitoring, and an unskilled support team

An NBC News article published earlier this year reports that the average data breach incident now costs a company nearly $3.8 million in administrative costs, such as forensics, breach notifications, and call centers. Excluded from this figure are litigation, settlement, and remediation costs, which can substantially increase the cost exposure.

For instance, Target recently reached a $67 million settlement with Visa arising from the 2013 breach and is working on a settlement with MasterCard.

Taking action

The examples discussed above from a business, legal, and security perspective are representative of only some of the risks and costs involved and are far from comprehensive. However, the underlying constant is that a proper management scheme to govern the information would mitigate those risks and costs. So how do we accomplish that?

Reduce volume

Data is being generated at vast rates and seeking to corral it is a tough enough task without adding redundant, obsolete, and trivial information into the mix. This requires the implementation of a sound information management policy.

The policy should be applied and communicated in such a way that it reduces the burden on employees. This means, where possible, automating the process so that there is less reliance on employees to subjectively determine retention value, for example through retention rules for emails.

Provide clear direction as to when to take action and what action is needed in relation to information. For example, a retention schedule that is overly granular, extending to hundreds if not thousands of record series, discourages adherence and compliance. If users are not confident about what action to take, they will take none.

Lastly, integrate processes into existing user workflows to manage information seamlessly, rather than introducing them as new steps.
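As a rough illustration of an automated retention rule for email, the sketch below applies a deliberately small schedule so that no employee has to judge retention value message by message. Everything in it is an assumption made for illustration: the three categories, the retention periods, the legal_hold flag, and the names Email and disposition. None of it is legal advice or a description of any particular product.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical schedule: a handful of broad categories rather than
# hundreds of record series. The periods are placeholders only.
RETENTION_DAYS = {
    "transitory": 90,
    "business": 3 * 365,
    "contract": 7 * 365,
}


@dataclass(frozen=True)
class Email:
    message_id: str
    category: str          # expected to be one of the RETENTION_DAYS keys
    received: date
    legal_hold: bool = False


def disposition(email: Email, today: date) -> str:
    """Return 'retain', 'dispose', or 'review' for a single message."""
    if email.legal_hold:
        return "retain"    # assumed rule: a hold always overrides the schedule
    days = RETENTION_DAYS.get(email.category)
    if days is None:
        return "review"    # unknown category: route to a person instead of guessing
    expired = today - email.received > timedelta(days=days)
    return "dispose" if expired else "retain"
```

A rule like this could run inside the mail system itself, so that users never see a separate retention step.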

Know where your information rests

The business needs to understand where and how data is being retained in its environment. This involves, at a minimum, an understanding of where data is stored, who is generating the data, and the methods by which the data is being generated or received. This will provide the foundation to implement strategies around its management.

By way of example, systems and applications where confidential, sensitive, and personal information rests can be identified to implement appropriate controls and security measures for protection. Utilization of particular systems and applications can also be identified to quickly eliminate sources of ESI from e-discovery, where the issues can be tailored to particular persons or departments.

Lastly, inefficient and problematic retention practices can be exposed and addressed.
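One way to picture such an inventory is as a small data map that can be queried both for security controls and for e-discovery scoping. The sketch below is a minimal example under stated assumptions: the system names, departments, and the SystemRecord fields are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemRecord:
    name: str
    departments: frozenset   # departments that create or receive data here
    holds_sensitive: bool    # confidential, sensitive, or personal information


# Hypothetical inventory entries, for illustration only.
INVENTORY = [
    SystemRecord("email", frozenset({"legal", "sales", "hr"}), True),
    SystemRecord("crm", frozenset({"sales"}), True),
    SystemRecord("hr_portal", frozenset({"hr"}), True),
    SystemRecord("build_server", frozenset({"engineering"}), False),
]


def needs_controls(inventory):
    """Systems that hold sensitive data and so need added protection."""
    return [s.name for s in inventory if s.holds_sensitive]


def in_scope_for(inventory, departments):
    """Systems used by any of the given departments; the rest can be ruled out."""
    wanted = set(departments)
    return [s.name for s in inventory if s.departments & wanted]


print(needs_controls(INVENTORY))
print(in_scope_for(INVENTORY, ["sales"]))
```

With a map like this, a dispute confined to one department can be narrowed quickly to the systems that department actually uses.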

Assess search and retrieval capabilities

Once the question of where, what type, and how data is being retained is answered, businesses can identify search and retrieval capabilities. On a going-forward basis, structure should be imposed on unstructured data.

Work with your business owners to identify the types of information that are critical from the types of documents they routinely encounter and access. Then, work with the IT team to assess the data fields that need to be accounted for to retrieve that information.

Lastly, if you do not currently possess the right tools to accomplish this, procure them to support the process with an eye toward automating as much of the process as possible and reducing reliance on employees to populate the metadata.

Risks, costs, and timing of addressing legacy data will require careful consideration. To lessen the burden, review the adopted records management policy and determine if there are any data stored in systems and applications, or if there are systems and applications as a whole, which are subject to immediate disposal due to their obsolescence.

Classify the remaining systems and applications that store data based on the risk profile of the business and address them in order of priority. This will be labor intensive, so be sure to set proper expectations about the time required. In addition, review existing technology, such as file analysis tools, to determine whether it is suitable for your information environment and can assist with the review process.
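The triage described above can be sketched as a two-step pass: set aside whatever the policy already marks for immediate disposal, then queue the rest by risk. The 1-to-5 risk scale, the LegacySystem fields, and the function name plan are assumptions chosen for illustration; a real risk profile would be defined by the business.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LegacySystem:
    name: str
    obsolete: bool   # subject to immediate disposal under the adopted policy
    risk: int        # 1 (low) to 5 (high); hypothetical business-defined scale


def plan(systems):
    """Split systems into an immediate-disposal list and a prioritized review queue."""
    dispose_now = [s.name for s in systems if s.obsolete]
    review_queue = sorted(
        (s for s in systems if not s.obsolete),
        key=lambda s: (-s.risk, s.name),  # highest risk first, name breaks ties
    )
    return dispose_now, [s.name for s in review_queue]


# Hypothetical systems, for illustration only.
systems = [
    LegacySystem("old_fax_archive", obsolete=True, risk=1),
    LegacySystem("shared_drive", obsolete=False, risk=3),
    LegacySystem("email_archive", obsolete=False, risk=5),
    LegacySystem("intranet_wiki", obsolete=False, risk=3),
]
dispose_now, review_queue = plan(systems)
print(dispose_now)
print(review_queue)
```

Sorting on a single score keeps the sketch short; in practice the ranking might weigh several factors, such as data sensitivity and litigation exposure.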

Simply doing nothing to mitigate the growth of information is not an option as technology continues to progress, external threats continue to mount, and the legal landscape addressing the ramifications of its use continues to change. The steps identified above are not comprehensive, but they provide a foundation to begin gaining control over data within the business environment. If ignored, unmitigated data growth will leave your business exposed to a multitude of risks and costs from a business, legal, and security perspective.

(About the author: Soo Kang is general counsel director at Zasio Enterprises, Inc.)
