Big Data, Little Happiness
In the legends of Greece, Dionysus holds a special place as the god of winemaking, ritual madness and ecstasy. According to the stories, Dionysus once granted King Midas a token of good favor. King Midas wished that whatever he touched should turn into gold. The boon was granted, and to King Midas’ great joy, everything he touched turned into the precious metal. But it was too much of a good thing, and soon fortune turned to misery as he turned his daughter into lifeless gold. Or so the legend says. Perhaps King Midas should have reflected first and asked for happiness instead.
In many ways, Dionysus is alive and well, as companies have been granted the “gold” of big data, to be created upon touch. IT departments are elated, as increasing amounts of data now magically turn up where there seemed to be little before. Manufacturing data? Sure. Customer habits? No problem. Monitoring everything that is still or moving, slow or fast, natural or industrial, man or machine? Slap on an API, link to a service, stream it, store it, replicate it, analyze it, visualize it, correlate it, reject it and ask for more.
We are falling into the Midas trap. “There’s gold in them thar mountains of data,” cry strategists, data scientists, consultants, service providers and product vendors. Out come the shovels, pickaxes, drilling equipment, earthmoving machines and data miners. But gold is hard to find, difficult to mine and usually present in smaller quantities than imagined. During the Gold Rush, it is said that more money was spent purchasing shovels and other mining tools than was realized in the discovered gold.
Can data make companies intelligent? Sure. Can data make companies more profitable, more efficient, more customer-centric and more strategic? Possibly.
Of particular concern is the rate of growth of data capture. More data is collected in one day now than existed in the world just a few years ago. Unfortunately, this speaks only to our ability to capture data, rather than to its inherent utility. This dramatic surge arises because the number of connections that can be made between content, users, apps and activities is increasing geometrically. And in many ways, this is just the beginning. As we advance further into the Internet of Things, we will see an explosion of data availability, with data being broadcast from every addressable entity: humans, nature, vehicles, machines, factories, drones, sensors, ad infinitum. Collecting any of these will soon be trivial (just an attachable service), but bringing the data into the enterprise will be costly and not necessarily useful.
Collection and storage of data require considerable thought and assessment. Should data be collected continuously or intermittently? Should it be on interaction or transaction? Should it be discrete data or based on statistical events? Should storage be enduring or should it have a defined expiry date? Should data be processed and discarded based on measures of utility derived? Should data stored have an assessed value?
Let’s look at a different industry for guidance. In the world of oil drilling, surveys are conducted of rock formations deep underground. One method is a seismic survey, which sends waves from the surface into the ground formations and studies the pattern of waves echoing back. When the formations underground are understood to have a geological pattern supporting the existence of oil or gas, exploratory wells are drilled for validation. Only when the extracted cores are found to be promising does the oil company drill production wells to extract oil and gas.
In contrast, enterprises start collecting data because they can. Data pours in, collected in increasing quantities. Data analytics machines commence work, create graphs, tables, correlations and dashboards, and yet the outcomes remain uncertain. We crunch data streams but fail to assess the value of each data stream. We need to be more thoughtful, analytical and definitive going in, before we enter into large-scale data collection. This requires assessing the underlying data structures of the enterprise and establishing models and perspectives on what outcomes they could yield, before big data collection begins.
Five Rules for Big Data
1. Understand Your Enterprise First. Before beginning the initiative, it is important to understand your enterprise. What are the internal and external workflows and information flows that accompany them? What activities generate data? Do your customers respond to your activities or to those outside your enterprise? Does your enterprise react to external data?
2. Model Your Enterprise. What are the models for the key workflows of your enterprise? How does data flow within the enterprise? What are the factors of production? What are the influence channels? What controls are available to initiate action? Are such controls available in isolation, in unison or in sequence? How sensitive are such controls?
3. Model Your Actions. Once you’ve established a model for your enterprise, model your prospective actions. What is the basis for action? What are the data insights that would call for action? What are the anticipated outcomes from actions? What are the time durations for the action and response? How will the impact of actions be analyzed to judge whether actions should be stopped, increased or decreased?
4. Implement the Data Analytics Plan. This is where “big data” collection, analytics and insights come in. Once you understand what is needed from data, it is now important to collect only the minimum needed to prove or disprove the hypotheses you may have for enterprise outcomes. For example, “I would like to give a customer-specific discount at certain times during the day, if I am confident that I can realize new sales.” It is better to start small, work on a single hypothesis, and prove that the data models, control models and enterprise models work. If not, they need to be iterated for improvement. It may be that data insights are interesting, but not actionable. It may be that actions are feasible, but enterprise structures do not allow implementation.
5. Roll Back Collection of Useless Data. It will become clear from the steps above that either the collection of data and its processing is advantageous to the enterprise or it is useless. It is important that data collections are reviewed for utility and discarded if not useful. Enterprises that collect data without a plan for expiry of data will suffer the taxes of data retention, storage and archival.
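To make rule 4 concrete, here is a minimal sketch of how the single discount hypothesis might be checked with only two small samples: customers who saw the regular price and customers who were offered the timed discount. The two-proportion z-test is one common choice and is an assumption here, not something the rules prescribe. The function name and the sample figures are hypothetical, invented purely for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Test whether group B converts at a higher rate than group A.

    conv_a, n_a: purchases and customers in the control group (no discount).
    conv_b, n_b: purchases and customers in the discount group.
    Returns (z, p_value), where p_value is one-sided: P(Z >= z).
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("No variation in outcomes; the test is undefined.")
    z = (rate_b - rate_a) / std_err
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Hypothetical figures: 1,000 customers per group.
z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"z = {z:.2f}, one-sided p = {p:.3f}")
```

A small p-value would support continuing the discount and, by extension, continuing to collect this particular data stream. A large one suggests iterating on the model or, per rule 5, rolling the collection back.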
Executives would do well to emulate Stephen Covey’s principle of “Begin with the end in mind” when it comes to their big data initiatives. These should be seen as part of business strategy rather than as IT initiatives. Viewed strategically and implemented thoughtfully, they will save money on the front end, validate business hypotheses and be a strategic asset for executive leadership.