Meanwhile, organizations are tackling the challenges of big data, handling more data in more formats that arrives and changes more quickly than ever before. As this tidal wave of data comes crashing down, traditional approaches to storing and managing data are being challenged. So, too, is the mindset that data must be queried and reported or visualized to be used. This combination of predictive analytics and big data should create new opportunities for more specific, future-oriented analysis of large amounts of data which, in turn, leads to actionable insight.
Yet most organizations are not combining predictive analytics and big data in this way. The usability and functionality of both predictive analytic and big data technologies have improved dramatically in the last few years, so technology incompatibilities are not the problem. Many predictive analytic tools support access to a wide range of data sources, including those typically branded “big data,” such as unstructured text or semi-structured Web logs and sensor data. The problem is that organizations are trying to apply these technologies to the wrong problem.
When large investments must be made in new technologies, new approaches and even new roles (such as data scientist or data miner), organizations worry about showing a return on this investment. When they look for problems to solve, they focus first on the big, hairy problems, figuring these will offer the best return if they can be solved. When it comes to data and analytics, organizations are no different. Realizing that better data and its subsequent insight can improve decision-making, organizations seek to influence big, strategic decisions. They try to use data and analytics to answer questions such as whether to acquire a company, continue the development of a blockbuster drug or sponsor the Olympics.
Instead, they would do better to focus on operational day-to-day decisions. These are the decisions about which companies accumulate big data volumes and where new data sources are proliferating. These are the decisions that are best suited to improvement using more advanced analytics, like predictive analytics. Plus, the cumulative value of these decisions is often far greater than the value of those one-off decisions. Putting big data and predictive analytics to work on these decisions requires decision management.
Operational Decisions Drive Big Data
Where does big data come from? Primarily, it comes from the increasing digitization of businesses and of life in general. Every interaction now creates data about what we did and where we did it. From a business perspective, it is our day-to-day operations that create this data, as we record more information about our customers, our products, the transactions we process, our machines and vehicles, and the goods that flow through our business. This data can be structured data, stored because we use enterprise applications to process our transactions, or semi-structured data, such as Web log data from our websites or sensor data generated as products move through the supply chain. It can also be unstructured data, found in emails or call center notes; increasingly, even phone calls and security video are being recorded. All this data is operational data.
In addition to the data we capture in our systems, we have access to a rapidly expanding array of new data sources as cloud-based data services become prevalent. Demographic and behavioral data about consumers dominate these data sources. In the past we could get a summary of the kinds of people in our stores; now we can get purchase history, demographic and other data about individual consumers or tiny clusters of them. More such sources are coming online all the time, contributing to the variety and volume challenges of big data and creating a richer picture of our customers and their world.
As far as data is concerned, whether it comes from our own systems or from third-party data sources, our day-to-day operations are where big data happens. Huge volumes and a wide variety of data are available. The increasingly real-time nature of our systems is also driving this data to arrive and change more rapidly, especially in an operational context, completing the volume, variety and velocity picture. Big data is operational data.
Predictive Analytics Add Value to Operational Decisions
If big data is focused on our operations, what about predictive analytics? Predictive analytic techniques generally take large amounts of historical data and detect patterns in that data that can be used to predict the future, often by assigning a probability to how likely something is to occur. A wide variety of techniques can be used to build predictive analytic models:
- Neural networks can assess how likely it is that a credit card transaction is being performed by the cardholder by evaluating how close this transaction is to the patterns predicted by that person’s past behavior.
- Regression models can determine which characteristics of a customer make it more likely that they will churn, or attrite, enabling a calculation of the risk of future churn.
- Response models can predict how likely a particular person is to respond to a particular marketing offer, based on the success or failure of offers made in the past.
- Predictive scorecards can determine the likelihood that someone will fail to make payments on his or her loan in the coming year.
These predictions of risk, fraud and customer opportunity are all created from large amounts of historical data. To build these models, you need not just data, but data over time. Data changing over time reveals behavior and patterns. As noted above, we have the most data for our operations. Therefore, our operational environment is where we have the data we need to build predictive analytic models.
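As a rough illustration of the regression example above, the sketch below fits a simple logistic regression to a tiny, invented customer history and turns it into a churn probability. The feature names, the data and the helper functions (`fit_logistic`, `churn_probability`) are all assumptions made for illustration; a real model would be built with a proper analytics tool on far more data.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit logistic regression weights with plain batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the linear score
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def churn_probability(x, weights, bias):
    """Score one customer: estimated probability that they will churn."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Hypothetical history: (support calls last quarter, years as a customer)
history = [(0, 5), (1, 4), (0, 3), (1, 6), (4, 1), (5, 1), (3, 2), (6, 0.5)]
churned = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = the customer later left

weights, bias = fit_logistic(history, churned)
low_risk = churn_probability((0, 5), weights, bias)   # long-tenured, no calls
high_risk = churn_probability((5, 1), weights, bias)  # new customer, many calls
print(f"low-risk customer: {low_risk:.2f}, high-risk customer: {high_risk:.2f}")
```

The output is a probability for each customer rather than a yes-or-no answer, which is what allows an operational decision (for example, whether to make a retention offer) to be tuned to the level of risk.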
One other note: to build solid predictive analytic models, you must often experiment. For instance, it is hard to build a response model to see how people might respond to a particular offer if you have never tried that offer on anyone. You need to conduct an experiment to see how it is received by your test group before you can build a model for your whole customer base. Your operational environment is where you can best experiment because large numbers of interactions allow you to try different approaches and gather experimental data.
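To make the idea of an operational experiment concrete, here is a minimal sketch that compares the response rate of a new offer against an existing one using a standard two-proportion z-test. The counts are invented and the function names are hypothetical; this is one common way to read such an experiment, not a prescribed method.

```python
import math

def two_proportion_z(resp_a, n_a, resp_b, n_b):
    """z statistic for the difference in response rates between groups B and A."""
    p_a = resp_a / n_a
    p_b = resp_b / n_b
    pooled = (resp_a + resp_b) / (n_a + n_b)  # response rate if the offers were equal
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

def two_sided_p_value(z):
    """Two-sided tail probability under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical experiment: existing offer (A) versus new offer (B)
control_offers, control_responses = 10_000, 200
test_offers, test_responses = 10_000, 260

z = two_proportion_z(control_responses, control_offers,
                     test_responses, test_offers)
p_value = two_sided_p_value(z)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

With only a few hundred interactions per group, a difference of this size would be hard to tell apart from chance; the large volumes found in day-to-day operations are what make such experiments informative.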