Data science underlies everything the enterprise now does
Data has been king for well over a decade by now, but the way we use it is undergoing some serious change. Gone are the days of awe at pretty charts and heat maps. Gone, too, is any patience for analytics unaligned to action.
In the enterprise, data science is no longer restricted to reporting duties in the C-suite. It is now being integrated into nearly every function of modern industry.
Key developments in the business applications of data science over just the past year include:
· The rise of “representative data” — data preparation, rigorous analytics, and data science to identify insights and understand business issues.
· Mainstreaming of machine learning and predictive analytics — now integral in business, customer, and engineering applications.
· Rapid spread of computational deep learning initiatives, now operational beyond just the big internet companies, especially for specialized applications (such as fraud detection in the banking system).
· Innovation in engineering analytics, especially notable in IIoT applications, where automated anomaly detection is foundational.
· Customer analytics maturing into a consistent discipline — with segmentation, propensity, affinity, geospatial, and loyalty analysis continuing their mainstream usage and evolution.
· Significant uptick in “systems of insight,” where insights from analytics are transformed into notifications, alerts, and actions on the business.
· Continued migration to “governed” data discovery across the corporate landscape — self-service analytics is still “hot,” but is now generally chaperoned with guidance and best practices, along with more structured performance, governance, and security.
· Beginnings of hybrid cloud adoption with scalable tenant resources and contextual routing, along with hybrid data and elastic compute engines — the brave new world of data in motion.
In the coming months, there’ll be even more activity in all these areas, especially in real-time streaming analytics for rapid intervention at moments of truth in business processes. Data science really is different in 2017.
From Data to Insight
The world is not lacking in data. The “big data” movement has focused on collecting and storing vast swaths of data in the hope of transforming business operations. But organically collected data, while cheap and easy to obtain, is often light on usable information, doesn't represent the business problems envisaged, and is difficult to assemble for analysis. Businesses are starting to address these issues and have renewed their focus on data quality, representation, and preparation for analysis.
In order to address a business problem, we need a business question and an understanding of a business process. We need data that are “representative” of the business problem, and tools to help distill these data into useful insights. New connected technologies, such as sensors and measurement devices, enable collection of more data, and some of these data make the business problem better represented. But the associated “data wrangling” (unifying and standardizing all the collected data from disparate sources to ready it for analysis) requires care and creativity.
Figure 1. Data representation of business problem.
Let me illustrate with a common problem in the Telco industry — subscriber churn. Say we’re a Telco provider and we want to understand why our customers are leaving prepaid and postpaid plans (so we can institute business processes to retain them). The convenient “big data” are our subscriber call records, modem, cable, phone, and network statistics. These data accumulate at a rapid clip, and are available in our logs and data stores. With data discovery techniques, we can wrangle these data, along with some customer attributes, to assess network effects on subscriber churn.
However, a big insight that emerges from churn analyses, beyond call quality and modem stats, is the effect of the subscriber’s personal call network. Through immersive data discovery and data wrangling, we find that the amount of time a subscriber spends speaking with other subscribers who churn across a similar timeframe is the best predictor of subscriber churn (deciding to leave is contagious and spread by conversational contact).
This “churn-chat” insight helps improve business efficiency. An associated churn-chat data feature (column) can be included in operations dashboards and predictive models, along with network effects. These features and models can then be incorporated into streaming analytics applications, informing sales, marketing, call center, and support actions to mitigate subscriber churn where the models predict it.
Figure 2. Subscriber call network. Subscribers in orange have churned, affecting their calling circle.
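To make the churn-chat feature concrete, here is a minimal Python sketch of how such a column might be computed. The record layout (caller, callee, seconds), the subscriber names, and the function name churn_chat_share are all assumptions made for illustration; a real analysis would also restrict calls and churn events to a similar timeframe, which this sketch leaves out.

```python
from collections import defaultdict


def churn_chat_share(call_records, churned):
    """Fraction of each subscriber's talk time spent with churned subscribers.

    call_records: iterable of (caller, callee, seconds) tuples (hypothetical schema).
    churned: set of subscriber ids that churned in the analysis window.
    """
    total = defaultdict(float)
    with_churned = defaultdict(float)
    for caller, callee, seconds in call_records:
        # A call counts toward the talk time of both parties.
        for me, other in ((caller, callee), (callee, caller)):
            total[me] += seconds
            if other in churned:
                with_churned[me] += seconds
    # Skip subscribers with no talk time to avoid dividing by zero.
    return {s: with_churned[s] / total[s] for s in total if total[s] > 0}


# Made-up call records, purely for illustration.
calls = [
    ("ann", "bob", 300),
    ("ann", "cat", 100),
    ("dan", "bob", 50),
    ("dan", "eve", 150),
]
shares = churn_chat_share(calls, churned={"bob"})
print(shares["ann"], shares["dan"])  # 0.75 0.25
```

The resulting dictionary maps directly onto a feature column: one value per subscriber that can sit alongside network statistics in a dashboard or a predictive model.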
Obtaining and wrangling representative data into features that define insights on business processes is at the core of today’s data discovery and analytics — it’s a big leap from raw data to business insight, but it’s exciting and delivers tangible results for the enterprise.
From Insight to Action
All data begin as real-time events. These data are brought to rest, where immersive data discovery can identify insights on key business problems. However, such insights are perishable, and need to be acted upon quickly to drive business value.
The data features that characterize business insights are a crucial component of an efficient business operation. Our sample Telco “churn-chat” feature is information-rich, and can be included in KPI dashboards for monitoring business status and in models that form the basis of actions in business applications. Such actions may include alerts to our call center regarding a change of customer state, or interventions directed toward the subscriber to retain loyalty.
In order to “execute” an insight and affect a business process, we use a deployment environment that converts the data discovery insight into business action. This typically includes:
· Data stream ingestion and processing including “real-time” data wrangling (in our example, say, calculating percentage of time spent speaking with other classes of subscribers).
· Model and rule execution engine(s) to predict or classify state (customer churn-risk prediction).
· Software standards between data science and DevOps stakeholders, enabling model/rule management, observation, and reporting.
· Application invocation related to the real-time state (your business applications, BPM case management system, call center script, SMS/email notifications).
Both the data discovery analytics and deployment environments are subject to the usual DevOps and IT requirements (standards-based, secure, manageable, scalable, observable, and extensible).
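The components listed above can be sketched end to end in a few lines of Python. This is a toy illustration under stated assumptions, not a production design: the class ChurnChatStream, the threshold rule at_risk, the 0.5 cutoff, and the notify callback are hypothetical stand-ins for a stream processor, a model/rule engine, and an application hook such as an SMS or call center alert.

```python
from collections import defaultdict


class ChurnChatStream:
    """Toy stand-in for stream ingestion plus real-time data wrangling."""

    def __init__(self, churned):
        self.churned = set(churned)
        self.total = defaultdict(float)
        self.with_churned = defaultdict(float)

    def ingest(self, caller, callee, seconds):
        """Update running totals for both parties; return their current shares."""
        updated = {}
        for me, other in ((caller, callee), (callee, caller)):
            self.total[me] += seconds
            if other in self.churned:
                self.with_churned[me] += seconds
            total = self.total[me]
            updated[me] = self.with_churned[me] / total if total else 0.0
        return updated


def at_risk(share, threshold=0.5):
    """Placeholder rule (assumed cutoff); a trained model would normally sit here."""
    return share >= threshold


def run(events, churned, notify, threshold=0.5):
    """Ingest events, score each update, and call `notify` once per newly flagged subscriber."""
    stream = ChurnChatStream(churned)
    flagged = set()
    for caller, callee, seconds in events:
        for sub, share in stream.ingest(caller, callee, seconds).items():
            if sub in stream.churned:
                continue  # already gone; nothing to retain
            if at_risk(share, threshold) and sub not in flagged:
                flagged.add(sub)
                notify(sub, share)
    return flagged


# Made-up event stream; `alerts` stands in for a downstream business application.
alerts = []
events = [("ann", "cat", 100), ("ann", "bob", 300), ("dan", "eve", 150), ("dan", "bob", 50)]
flagged = run(events, churned={"bob"}, notify=lambda sub, share: alerts.append((sub, round(share, 2))))
print(alerts)  # [('ann', 0.75)]
```

Keeping the wrangling, scoring, and notification steps behind separate functions mirrors the separation in the list: each piece can be replaced (a real stream processor, a trained model, a BPM or call center integration) without touching the others.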
Figure 3. Systems of Insight: Real-time analytic applications.
You’ll note that this model represents a significant broadening beyond traditional big data/analytics functions. Such task alignment and comprehensive integration of analytics functions into specific business operations enable high-value digital applications ranging far beyond our sample Telco’s churn mitigation — cross-selling, predictive and condition-based maintenance, fraud detection, price optimization, and logistics management are just a few areas where data science is making a huge difference to the bottom line.
How huge? In “Wind Turbines and Democratized Data Analytics: Turning Chaos into Order,” Vestas Wind Systems of Denmark noted that its new analytics-driven, condition-based maintenance applications supply measurable benefits: “In 2008, Vestas’ Lost Production Factor was 4.4 percent. Last year, it was 1.5 percent, compared to an industry average of 3.6 percent. For Vestas customers, this translates into a savings of €150 million.”
With numbers like that, it’s no wonder that enterprise data science is on the move.