John Myers, research director for analyst firm Enterprise Management Associates, believes the Internet of Things is the shiny new toy of big data, and with good reason. Analytics run on sensor and machine data from smart devices yield a quantifiable impact on cost savings and revenue, and CFOs are taking notice. In this interview with Information Management, Myers shares trends uncovered in EMA’s big data research, implications for the future of big data and use cases of big data in practice across a variety of industries and markets.
EMA has conducted extensive research about big data. Would you talk about that research and some of the implications uncovered for the future of big data?
EMA conducted research in 2012 and 2013 [and is] planning new [big data] research for 2014. In each of these end-user-based research studies, and in our work with technology vendors, we are finding that the concept of big data is evolving. In 2012, we saw big data emerging, and organizations were trying to get their arms around it to understand the hype. [Research in] 2012-2013 showed end-user organizations taking that big data and starting to integrate it into their business processes for fraud management, customer relationship management and logistics optimization. 2014 promises to continue this trend with the growth of sensor-based data sources and the Internet of Things.
How mature do you think these initiatives are for most organizations?
It depends. Somebody mentioned the concept of “born digital” to me the other day. The Netflixes and Amazons of the world were born digital, and to a certain extent they have a much easier time integrating data, particularly big data and Internet of Things data, into their lives. Companies that were not born digital are having a harder time doing that because they’re still used to structured transactional data. Their systems are built for that, their cultures are built for that, their mindsets are built for that. So, companies that were born digital will have a much easier time integrating, and I think we’re seeing that already.
There’s an exception to this rule, which is General Electric. As part of their manufacturing business, they have [decided] to sensor-enable everything they make: jet engines, trains, etc. So this is an example of a company that was not born digital but is saying they can be that key enabler [for digital business], and they’ve taken a fine step to go after that. General Electric is my poster child for a company that wasn’t born digital but understands the value of it and is saying, “This will be a priority for us because we feel our customers are going to want that.”
Have any new or surprising trends around big data stood out to you from the research and your work with clients?
The emergence of machine data. This has been in the form of server logs, such as Web traffic, or geo-location information, such as GPS data from smartphones. Looking ahead, sensor information from the connected home and from sensors on trucks, planes and packages will be the next wave of information sources for big data initiatives.
Why do you think this is such a growth area?
There are three types of data that we have in our study: there’s transactional data, something that comes out of a point-of-sale system; there’s human-generated data, which might be Twitter, a blog or a picture; and then there’s machine-generated data, which is log files, sensors, etc. The reason that machine-generated data swapped places with human-generated data is that from a sensor perspective it’s easy for me to look at the log files that come out of my environmental control system and say, “If I raise the temperature in the buildings in the summertime from 71 to 72, I can effect a dollar change and a lowering of my costs.” Likewise in the wintertime: instead of [setting the temperature at] 70, I set it for 68. I can track that to a dollar cost savings or I can track it to a revenue generation type of activity, so it’s top or bottom line. Tweets and blogs are still kind of squishy. I can calculate exposure, I can calculate sentiment, but it’s hard for me to calculate actual sales or reduction in costs.
So you’re saying that ROI is somewhat easier to calculate for machine data.
One good example is airplane engines. Fuel is one of the largest costs for any airline other than flight attendants and pilots. I can’t get rid of flight attendants and pilots because they’re mandated by law, but I can reduce the amount of fuel that I burn, or put the amount of fuel that I burn into an optimization calculation that [could hypothetically show] if I reduce my speed by five miles per hour I may have one less flight but I could save one and a half flights of fuel costs. And because you’re putting [sensors] in things like locomotives, refrigerators, washing machines and electrical grids, it’s easier to take that [data] and tie it to either top-line revenue or a bottom-line cost reduction. CFOs are able to gravitate to this aspect of the Internet of Things and big data.
[So] there’s the Internet of Things as an application of big data initiatives. The Internet of Things is a use case of big data because it’s multistructured data, in real time, all over the place.
For all the hype surrounding big data, how much confusion do you think still exists in the marketplace?
There is a ton of confusion about big data. How big? What shape or structure? What speed of ingestion? Who is responsible? EMA has been able to reduce some of the confusion in our end-user research. We know that an “average” big data environment is around 100TB in size -- not something that you can do with shadow IT, but not insurmountable for most organizations. We know that data in big data initiatives is structured and multistructured, not just woolly Hadoop data.
EMA’s Hybrid Data Ecosystem [is an information architecture framework that] has eight different platforms associated with it, because we’ve found that people aren’t just using one Hadoop or one Cassandra or one Cloudera to do this. In our research, 65 percent of respondents are using between two and four platforms to achieve their big data initiatives. [People need to know] how to take these platforms -- including the data warehouse, data marts, operational systems, cloud systems, etc. -- pull them together and do fraud management, asset logistics and customer relationship management. We have lots of different things that need to go into the ecosystem, not just one platform to rule them all.
The other thing we know is that the budgets are not just IT budgets. We’ve found that almost 50 percent of the projects are getting funding from non-IT sources -- that includes marketing, finance and sales, which means that these are business stakeholder investments in these initiatives, not just IT doing what IT wants to do.
What areas of big data have the potential to bring the most value to organizations, and what must they do to realize that value?
That depends on the industry. For transportation and logistics, the ability to reduce the costs and greenhouse gas footprint of shipping and to become a value-add partner for omnichannel retailing will be important areas of growth. For retail, understanding customer needs and wants via dynamic offers across channels, and doing so at profitable margins, is important. For health care, the ability to take the concept of the Internet of Things and apply it to critical care situations such as neonatal care and ICU patients, and to everyday health maintenance via [wellness tracking devices such as] Fitbits and monitors, will provide benefit.
What are the biggest challenges companies have with big data initiatives?
The biggest challenge is knowing where to start. Big data and the Internet of Things can be specific solutions for individual companies or industries. [It’s important to understand] that big data and Internet of Things data sources are in fact “just” data sources and should be applied to business challenges, not treated as data source solutions in search of a problem.