Big data has been the talk of the analytics world for quite a while and shows no sign of slowing down. The promise of analyzing all relevant information, regardless of source, is an incredibly appealing idea. But despite all the buzz and interest, very few big data implementations have moved beyond the core Hadoop use cases of Web analytics, ad click analysis and failure prediction. This narrow focus stems largely from the habit of lumping all big data under the label of unstructured data, which limits its potential.
Has this stereotype impacted adoption?
Yes. Putting all types of information, including information that is not traditionally structured, under a single label makes it difficult for companies to take a more nuanced view of their data landscape. That nuanced view is a core requirement for understanding which technologies to bring to bear and, more importantly, which data sources will provide the biggest bang for your big data buck.
Big Data versus Big Content
One of the most important divisions within the big data space is the separation between unstructured data (think unparsed data such as logs or sensor data) and unstructured content (any source where the insight you want is locked away in human-created text). The profiles of these information sources are very different, and so are the tools required to gain analytic insights from them.
For example, log data can arrive in huge volumes and at high velocity, but a large percentage of it is not valuable at a per-record level. However, by applying analytic algorithms via MapReduce, you can efficiently crunch these data sets down to meaningful analytic insight that reveals overall trends. On the other hand, consider email as a source. Email has a high value per record, is not typically created at the same volume and velocity, and requires linguistics and text analytics to extract meaningful insights.
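To make the log-crunching idea concrete, here is a minimal sketch of the map, shuffle and reduce steps in plain Python. It is not Hadoop code; it only imitates the pattern in memory. The log format, the sample lines and the function names (`map_record`, `shuffle`, `reduce_group`, `count_events`) are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical log lines in an assumed format:
# "<timestamp> <level> <component> <message>"
SAMPLE_LOGS = [
    "2013-05-01T10:00:01 INFO web request served",
    "2013-05-01T10:00:02 ERROR db connection timeout",
    "2013-05-01T10:00:03 INFO web request served",
    "2013-05-01T10:00:04 ERROR db connection timeout",
    "2013-05-01T10:00:05 WARN cache eviction started",
    "malformed line",
]


def map_record(line):
    """Map step: emit ((component, level), 1) for each parseable line."""
    parts = line.split(maxsplit=3)
    if len(parts) < 3:
        return []  # skip records that do not match the assumed format
    _, level, component = parts[:3]
    return [((component, level), 1)]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_group(key, values):
    """Reduce step: collapse one key's values into a single count."""
    return key, sum(values)


def count_events(lines):
    """Run map, shuffle and reduce over an iterable of log lines."""
    mapped = chain.from_iterable(map_record(line) for line in lines)
    return dict(reduce_group(k, v) for k, v in shuffle(mapped).items())


trends = count_events(SAMPLE_LOGS)
for (component, level), count in sorted(trends.items()):
    print(component, level, count)
```

No single log line matters much here; the value appears only once many records are aggregated into counts per component and severity, which is the per-record profile described above.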
Fortunately, the big data conversation is evolving, and the term “big content” is increasingly used by organizations like Gartner and AIIM. This newer term is well justified, given the significant differences in the information itself, and it allows us to have a more focused conversation driven by business-value creation. When companies start to compare the business value of also investing in big content versus just big data, they start to realize that big content is, in many cases, the most important aspect from a value standpoint: an “unsung hero,” if you will.
Big Content Drives Big Business Value
You’ll need to investigate and ultimately invest in different technologies to gain business value from analyzing big data and big content, so how do you decide where to start? This is where the value side of the big data conversation becomes very important. Many companies have fallen into the trap of jumping right into one of the various flavors of Hadoop without understanding how they are really going to get a return on their investment. In general, big data brings the promise of inferring future trends or behavior (what will my customer be more likely to buy, when will my system fail next, etc.), which can certainly be valuable, but it takes a lot of raw data to get statistically meaningful insight from these sources.
On the other hand, consider what we can learn from analysis of this example of a single airline complaint email:
By analyzing this single email, what could the customer experience be like the next time the customer calls in? The customer service representative (CSR) would see not only the transaction history, but also the interaction history that represents the current state of the relationship. Using a BI and data visualization tool, this data can be visually summarized so that the CSR understands the state of the relationship at a glance and can proactively offer an entirely new and transformative level of service.
Now consider the value of a lot of emails. In the case of this example airline, they could answer questions such as “Which of our high-value customers are having a high ratio of negative interactions?” “What routes are they flying?” and “Why are they unhappy?” This information could be available visually through a dashboard.
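As a rough sketch of how the first two questions might be answered once each interaction has been scored, the Python below computes the share of negative interactions per customer and lists the routes involved. It assumes a sentiment label has already been assigned by a separate text analytics step that is not shown. The `Interaction` record, the customer IDs, the routes, the 0.5 threshold and the function names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Interaction:
    customer_id: str
    route: str
    # Assumed to be assigned upstream by text analytics (not shown here).
    sentiment: str  # "negative", "neutral" or "positive"


def negative_ratios(interactions):
    """Return the fraction of negative interactions for each customer."""
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for item in interactions:
        totals[item.customer_id] += 1
        if item.sentiment == "negative":
            negatives[item.customer_id] += 1
    return {cid: negatives[cid] / totals[cid] for cid in totals}


def flag_customers(interactions, high_value_ids, threshold=0.5):
    """High-value customers whose negative ratio meets the threshold."""
    ratios = negative_ratios(interactions)
    return sorted(
        cid for cid, ratio in ratios.items()
        if cid in high_value_ids and ratio >= threshold
    )


def routes_with_complaints(interactions, customer_ids):
    """Routes on which the given customers had negative interactions."""
    return sorted({
        item.route for item in interactions
        if item.customer_id in customer_ids and item.sentiment == "negative"
    })


# Made-up sample data for illustration only.
sample = [
    Interaction("C1", "JFK-LAX", "negative"),
    Interaction("C1", "JFK-ORD", "negative"),
    Interaction("C1", "JFK-LAX", "positive"),
    Interaction("C2", "BOS-SFO", "positive"),
    Interaction("C2", "BOS-SFO", "neutral"),
    Interaction("C3", "SEA-DEN", "negative"),
]
high_value = {"C1", "C2"}  # hypothetical list of high-value customers

flagged = flag_customers(sample, high_value)
routes = routes_with_complaints(sample, set(flagged))
print(flagged, routes)
```

The third question, why customers are unhappy, is the part that depends on linguistic analysis of the email text itself, so it is deliberately left out of this sketch.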
Taking things a step further, in customer experience analysis alone, there are a large number of big content sources of customer interactions (email, CRM case notes, call center notes, social media, SMS, company forums, survey comments, etc.). In contrast to big data analysis, which produces inferred insight about customer behavior, every record from every big content customer source contains direct insight into what customers like, what they do not like and what they are looking to buy or not buy.
Finally, big content is everywhere. It is highly relevant to day-to-day business operations and has huge analytic potential.
As you jump into the big data waters, take some time to think about which sources of information contain the insight you need to “move the needle” for your business. You’ll likely conclude that big content is a place to start realizing big value from big data. Here are some suggestions by vertical market to help you get started.