Leveraging the value of text-based data by applying text analytics can help companies gain competitive advantage and an improved bottom line, yet many companies are still letting their document repositories and external sources of unstructured information lie fallow.
That’s no surprise, since the application of analytics techniques to textual data and other unstructured content is challenging and requires a relatively unfamiliar skill set. Yet applying business and industry knowledge and starting small can yield satisfying results.
Capturing More Value from Data with Text Analytics
There’s more to data than the numerical organizational data generated by transactional and business intelligence systems. Although the statistics are difficult to pin down, it’s safe to say that the majority of business information for a typical company is stored in documents and other unstructured data sources, not in structured databases. In addition, there is a huge amount of business-relevant information in documents and text that reside outside the enterprise. To ignore the information hidden in text is to risk missing opportunities, including the chance to:
- Capture early signals of customer discontent.
- Quickly target product deficiencies.
- Detect fraud.
- Route documents to those who can effectively leverage them.
- Comply with regulations such as XBRL coding or redaction of personally identifiable information.
- Better understand the events, people, places and dates associated with a large set of numerical data.
- Track competitive intelligence.
To take advantage of these opportunities and to capture the value that may be locked inside non-numerical data, many companies are turning to text analytics. Text analytics is the practice of aggregating, exploring and teasing answers from textual data to understand what’s happening in the business and drive strategy and performance. It comprises the skills, technologies, applications and practices needed to continually gain insight into what’s driving business outcomes. The analysis of textual data is part of a larger suite of business analytics disciplines, such as clustering, affinity grouping and optimization scenario modeling. Text analytics can complement and round out the analytics process by: retrieving documents to process, classify and organize; extracting information from the results; and analyzing this information together with the associated structured data to derive business insights.
Combining text analytics with the analysis of structured data can help companies leverage the untapped potential of unstructured data by facilitating and automating the analysis and interpretation of language. Text analytics applies linguistic and statistical techniques to extract concepts and patterns, transforming language into data and unlocking meaning and relationships.
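As a minimal sketch of what “transforming language into data” can mean in practice, the Python snippet below counts how many documents mention each term. The stop-word list, the function names and the sample notes are all assumptions made for illustration; commercial text analytics tools apply far richer linguistic processing.

```python
import re
from collections import Counter

# Hypothetical stop-word list; real projects use a fuller, domain-tuned list.
STOP_WORDS = {"the", "a", "an", "and", "is", "was", "to", "of", "in", "it", "after", "on"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_counts(documents):
    """Count how many documents mention each non-stop-word term."""
    counts = Counter()
    for doc in documents:
        # Use a set so that each term counts at most once per document.
        counts.update(set(tokenize(doc)) - STOP_WORDS)
    return counts


# Invented sample notes, for illustration only.
notes = [
    "Battery overheats after charging.",
    "Customer reports the battery drains quickly.",
    "Screen cracked on delivery.",
]
counts = term_counts(notes)
print(counts["battery"])  # 2: two of the three notes mention the battery
```

Once text is reduced to counts like these, it can be stored, trended and joined with structured data in the same way as any other numerical field.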
For example, consider a company that wishes to use text analytics to build an “early warning system” for identifying factory issues based on warranty claims, technician reports and data from customer contact points, such as emails and phone calls. Such a project could result in savings on warranty claims and an increase in customer satisfaction.
The market potential of text analytics is explosive. According to a study by InformationWeek magazine, revenues from text analytics software and services now total $835 million globally, with growth rates anticipated to be between 25 percent and 40 percent annually for the next few years.
Leveraging the Potential of Text: Approach with Caution
While the opportunities to reap value from text analytics are sizable, it’s wise to take a measured approach to pursuing them. The vendor market is volatile and competitive, and the field is in the early stages of the innovation curve for many business applications. Furthermore, many companies do not yet have a basic document and data management strategy in place, which is a prerequisite for leveraging data with text analytics.
Also, the skills to work with text analytics tools are not yet widespread, and some of the tools have a steep learning curve. So, solving one text analytics challenge does not mean that the organization is ready to take on another, especially if the solution requires domain customization of any kind. Moreover, different business problems require different technologies and approaches; there is no single solution for all text analytics challenges.
A pragmatic approach would be to implement a text analytics solution through a vendor alliance model, in a more mature application area (as discussed below), or as part of a strategic or business goal assessment. Another option would be to start with a data readiness or document management assessment or initiative. In short, start small, learn from successes (and mistakes) and implement text analytics in critical-need areas to establish value throughout the enterprise.
Text Analytics in Action
Typical text analytics activities include standard approaches, such as search-based applications, information capture and information management. However, the newest use of text analytics is content analysis to feed enterprise applications, such as BI and customer relationship management. Also powerful, but less frequently discussed in the marketplace, is the idea of combining the output of text analytics with structured data for more effective analytics solutions.
A typical text analytics initiative begins like any analytics initiative: establishing a clear business problem to be solved, envisioning an end state and aligning the proposed solution and approach with broader organizational strategy. Metrics should also be established to measure the impact of the implemented solution(s) and to demonstrate the expected ROI. However, tangible ROI may be difficult to measure. Companies that implement text analytics are often treading on new analytics ground and there aren’t many models to guide valuation estimates.
Since they’re often in uncharted territory, many companies choose more mature application areas of text analytics for their first project. These areas include customer relationship analytics and some targeted applications that combine the output of text analytics with existing structured data to perform advanced analytics.
Examples of Text Analytics Projects
For example, a software manufacturer could mine textual data, such as product documentation, plus structured data, such as project metrics or call volume, to spot emerging issues associated with a new software release. Armed with alerts generated by analytics applied to the mined information, the manufacturer can develop a strategy to address these issues before they propagate to a large number of users. Such an early fix could save millions on support costs and could go a long way toward maintaining good customer relationships.
In a more sophisticated, real-life example, one insurance company applied text analytics combined with traditional structured data analytics to construct a predictive model for claims costs. The company wanted to improve its estimation of claim costs at first notice of loss in order to focus on reducing high-cost claims expenses. The company worked with consulting and vendor partners to build a predictive claims-cost model. Initially, the model used only structured data, but when key words from the unstructured textual claims description were added as model features the predictive performance jumped by 14 percent. This performance gain helped the company assign its most experienced agents to its most complicated claims, which resulted in significant reductions in claims expense.
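The idea of adding key words from a claims description as model features can be sketched roughly as follows. This is not the insurer’s actual model: the field names, the keyword list and the sample claim are hypothetical, and a real project would select keywords from its own claims data and feed the resulting rows to a predictive modeling tool.

```python
# Hypothetical keywords; a real model would select them from the claims data.
KEYWORDS = ["fracture", "surgery", "attorney"]


def build_features(claim):
    """Combine structured fields with one 0/1 flag per keyword found in the text."""
    text = claim["description"].lower()
    features = {
        "claimant_age": claim["claimant_age"],
        "prior_claims": claim["prior_claims"],
    }
    for word in KEYWORDS:
        # Simple substring match; it would also fire on longer words
        # that happen to contain the keyword.
        features["kw_" + word] = 1 if word in text else 0
    return features


# Invented sample claim, for illustration only.
claim = {
    "claimant_age": 52,
    "prior_claims": 1,
    "description": "Claimant slipped on ice; possible wrist fracture, surgery scheduled.",
}
row = build_features(claim)
print(row)
# {'claimant_age': 52, 'prior_claims': 1, 'kw_fracture': 1, 'kw_surgery': 1, 'kw_attorney': 0}
```

Each claim becomes a single row in which the text-derived flags sit alongside the structured fields, so the same model can draw on both kinds of data.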
Challenges in Implementing Text Analytics
Although text analytics presents unique opportunities to capture more value from data, it also presents challenges, some related to its novelty and others related to the difficulty of working with unstructured content. In addition, some of the challenges of a typical analytics project are magnified when text comes into the picture.
The first challenge is actually working with the data. Access, storage and retrieval of textual data may require a different approach than structured data. The data is likely to be highly variable, noisy, uncontrolled and heterogeneous, and privacy issues may come into play, especially if email is involved. Finally, query languages focused on semi-structured and unstructured data require a new skill set for analysts to master.
Next, there would be a learning curve to navigate the new text analytics toolsets on the market. They come with new vocabulary, including document categorization, information extraction and sentiment analysis. Scalability is likely to become even more of an issue than with structured data due to the large volumes of text that often need to be analyzed. Finally, integrating textual data rules and taxonomies with existing enterprise applications and business rules may be needed to obtain the most effective outcomes.
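To make the vocabulary a little more concrete, the sketch below shows a very simple form of document categorization driven by a taxonomy of trigger terms. The taxonomy, category names and sample sentence are invented for illustration; production tools typically combine much larger taxonomies with statistical or linguistic models.

```python
import re

# Hypothetical taxonomy mapping each category to its trigger terms.
TAXONOMY = {
    "billing": {"invoice", "refund", "charge"},
    "product_defect": {"broken", "defect", "cracked"},
}


def categorize(text):
    """Return the sorted categories whose trigger terms appear in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(cat for cat, terms in TAXONOMY.items() if tokens & terms)


print(categorize("The screen arrived cracked and I want a refund."))
# ['billing', 'product_defect']
```

Keeping the taxonomy in a plain data structure, separate from the matching logic, is one way to make it easier to align with existing business rules as they change.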
To mitigate the impact of these challenges and reduce the effect of the learning curve on the organization, a practical course may be to begin a text analytics initiative with a limited pilot project. Likely pilot areas would be those that have large volumes of readily accessible textual data, such as customer analytics. Once the pilot has shown the value of text analytics, the initiative can be expanded to other areas, with the benefit of knowledge gained and successes won.
Wrapping it Up
The majority of a typical business’s stored information is in an unstructured, textual format. Most businesses do not leverage this information to improve their bottom line. Early signals of customer sentiment and financial outcomes often emerge first in textual data. Text analytics provides automated, repeatable solutions that identify useful information hidden in those unstructured documents.
Text analytics has great potential, but it calls for a careful approach because there is no one-size-fits-all solution to business problems. Business needs should lead the initiative; companies should not invest in the technology simply because it’s new.
Finally, the skills and technologies needed to succeed with text analytics are still evolving, so talent may have to be recruited or retrained. However, companies that start with a well-defined problem and demonstrate the value and impact of solving it will see new opportunities emerge that they can pursue. These new areas could include contract performance analysis, R&D support, workplace safety analytics and drug or medical care safety analysis. The potential is virtually endless.
This publication contains general information only and Deloitte is not, by means of this publication, rendering accounting, business, financial investment, legal, tax, or other professional advice or services. This publication is not a substitute for such professional advice or services, nor should it be used as a basis for any decision or action that may affect your business. Before making any decision or taking any action that may affect your business, you should consult a qualified professional advisor. Deloitte shall not be responsible for any loss sustained by any person who relies on this publication.
Deloitte refers to one or more of Deloitte Touche Tohmatsu Limited, a U.K. private company limited by guarantee, and its network of member firms, each of which is a legally separate and independent entity. Please see www.deloitte.com/about for a detailed description of the legal structure of Deloitte Touche Tohmatsu Limited and its member firms. Please see www.deloitte.com/us/about for a detailed description of the legal structure of Deloitte LLP and its subsidiaries. Certain services may not be available to attest clients under the rules and regulations of public accounting. Copyright 2010 Deloitte Development LLC. All rights reserved. Member of Deloitte Touche Tohmatsu Limited.