As our dependence on email has grown, so has our need for efficient tools that cut through the clutter and find what's really relevant. In the Enron case alone, investigators seized over 12 million documents. If those pages were printed out, the stack would be over three times taller than the Sears Tower.
Or consider the case of an inventor who recently brought a patent infringement suit against a large corporation. His lawyers had to read over 50 million pages of electronic documents for potential relevance in just a few months. Ten years ago, it would have taken an army of lawyers several years to examine all of the content and assess its relevance to the case. Today, this type of e-discovery project is much faster and more cost-effective thanks to emerging visual analytics tools, which allow legal teams to quickly and thoroughly review sensitive corporate data for litigation, regulatory requests and internal investigations.
In the patent case, a small legal team was able to use visual analytics to convert the enormous set of data into graphics that were organized by the documents' concepts. This helped them find the "smoking gun" piece of evidence - an Excel spreadsheet attached to an email - and secure a $500 million ruling in favor of the inventor.
In today's litigious and regulatory environment, companies are growing increasingly aware of the need to have processes and tools for finding and reviewing massive amounts of unstructured electronic documents quickly and thoroughly. The new Federal Rules of Civil Procedure (FRCP) that took effect last December are driving companies to examine how IT and legal departments can better work together to save money and reduce the risk of litigation. Those currently evaluating their e-discovery plans should seriously consider visual analytics because it is uniquely suited to handle these challenges.
Review: the Largest Addressable Cost in E-Discovery
Much of the discussion within the IT industry this year has focused on the collection of data, since in-house technologists are often asked to drop everything at a moment's notice to find terabytes (as many as 90 in some cases) of data for important legal matters. Despite the importance of collection, e-discovery early adopters in highly regulated industries such as financial services, pharma, energy and telecom point to review as the largest addressable cost in e-discovery.
Currently, corporations spend huge chunks of their legal budgets on attorneys billing $300 an hour to review thousands if not millions of electronic documents. In fact, a recent study estimated that U.S. corporations spend nearly $5 billion a year analyzing emails for litigation, regulatory requests and investigations. However, the productivity benefits of visual analytics are creating a paradigm shift within e-discovery.
A Forrester Research report from December of 2006 echoes these findings, noting, "Tools with visual analytics built in can make these legal professionals more efficient by determining whether or not data is relevant, is privileged, or even needs to be produced in response to a discovery request."1
Visual Analytics: Taming Unstructured Content
Most business users already use some form of visual analytics, since it's grown increasingly common in a number of applications. Leading enterprise resource planning (ERP), customer relationship management (CRM) or business intelligence (BI) tools, for example, are likely to have pie charts and other graphical representations available for users to quickly grasp a high-level overview of the data and evaluate revenue trends. The availability of visual analytics handling unstructured content, however, has lagged behind ERP and CRM solutions because it has been easier to create visual representations from numbers than from unstructured content found in email.
Sophisticated visual analytics tools that can identify the nouns and noun phrases in a series of messages, then visually cluster the documents together according to similarities in subject matter, are revolutionizing the e-discovery review process. The software can contextually assess these nouns and noun phrases to distinguish words with different meanings, e.g., when the word "diamond" is referring to baseball, as opposed to an engagement ring.
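The grouping step described above can be illustrated with a minimal sketch. It is not how any particular e-discovery product works: it assumes a hypothetical stopword list in place of true noun-phrase extraction (a real tool would likely use a part-of-speech tagger), and it groups messages with a simple greedy rule based on word overlap. The function names, threshold and sample messages are all invented for illustration.

```python
import re

# Hypothetical stopword list. A real tool would more likely use a
# part-of-speech tagger to pick out nouns and noun phrases.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "on", "for",
             "is", "are", "was", "at", "with", "we", "our", "this", "that", "by"}

def key_terms(text):
    """Return the set of lowercase content words in a message."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOPWORDS and len(w) > 2}

def jaccard(a, b):
    """Overlap between two term sets: 0.0 (disjoint) to 1.0 (identical)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def cluster(messages, threshold=0.2):
    """Greedy grouping: each message joins the first cluster whose seed
    message shares enough terms with it, otherwise it starts a new one."""
    clusters = []  # list of (seed_terms, member_indices)
    for i, msg in enumerate(messages):
        terms = key_terms(msg)
        for seed_terms, members in clusters:
            if jaccard(terms, seed_terms) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((terms, [i]))
    return [members for _, members in clusters]

# Invented sample messages: "diamond" appears in all four, but the
# surrounding terms separate the baseball sense from the jewelry sense.
messages = [
    "The pitcher walked across the diamond before the baseball game",
    "Baseball game tonight at the diamond, bring the pitcher",
    "She chose a diamond engagement ring at the jeweler",
    "The jeweler resized the diamond engagement ring",
]
groups = cluster(messages)
print(groups)
```

Here the shared word "diamond" alone is not enough to merge the two topics; the surrounding terms decide which group a message lands in, which is a rough stand-in for the contextual assessment the paragraph describes.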
Once the content is analyzed, the software can visually represent the key concepts included within the data by clustering documents around similar ideas. These tools can also provide a timeline and social networking view so that users can see which company employee emailed with whom and when. As analysts click on clusters and read the documents, they can quickly tag them as privileged or responsive to the case.
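The timeline and "who emailed whom" views rest on simple counts over message metadata. The sketch below shows one way such counts could be derived, assuming each message has already been reduced to a hypothetical (sender, recipient, date) record; the names and dates are invented, and a real tool would extract them from mail headers.

```python
from collections import Counter
from datetime import date

# Hypothetical message metadata: (sender, recipient, date sent).
emails = [
    ("alice", "bob", date(2006, 3, 1)),
    ("alice", "bob", date(2006, 3, 2)),
    ("bob", "alice", date(2006, 3, 2)),
    ("carol", "alice", date(2006, 4, 10)),
]

def interaction_counts(records):
    """Count messages exchanged per pair of correspondents,
    ignoring which of the two was the sender."""
    counts = Counter()
    for sender, recipient, _ in records:
        counts[frozenset((sender, recipient))] += 1
    return counts

def monthly_timeline(records):
    """Count messages per (year, month), in chronological order."""
    counts = Counter()
    for _, _, sent in records:
        counts[(sent.year, sent.month)] += 1
    return sorted(counts.items())

pairs = interaction_counts(emails)
timeline = monthly_timeline(emails)

for pair, n in pairs.most_common():
    print(sorted(pair), n)
print(timeline)
```

The pair counts would feed the edges of a social network diagram, and the monthly counts would feed a timeline chart. The drawing itself is left out of this sketch.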
In minutes, visual analytics can identify correspondence, evaluate its substance, show which parties were involved, and measure the volume of their interaction. This ability makes visual analytics well suited to e-discovery because it addresses two of the main challenges for corporations involved in litigation today: speed and risk.
Speed: Early Case Assessments and Increased Productivity
Periodically, business cases grab headlines when embarrassing emails are entered into evidence. Consider a Massachusetts class-action suit brought a few years ago over the dangers of the diet drug Fen-Phen. During discovery, the court permitted admission of an email from an executive at the defendant company in which he wrote, "Do I have to look forward to spending my waning years writing checks to fat people worried about a silly lung problem?" In addition to damaging a company's brand outside of the courtroom, these email revelations can lead to runaway verdicts and stiff penalties.