Lately the big data big shots have been stealing the spotlight from every corner of IT. And why not? It is truly ground-breaking technology.
It enables the collection and processing of Amazon-sized streams of all types of data, along with amazing analytics capabilities, to provide businesses with the answers they have wanted for years. We are talking football-field-sized data centers with cheap hardware united in the common purpose of making sense of all the data we can possibly collect about our company, our customers or the world. Some will tell you it can literally predict the future by carefully studying and modeling past trends. And after all, who doesn’t need to know that tomorrow at 3:05 p.m. is the most likely time that males ages 29 to 38 in the Northeastern U.S. will have an urge to buy red ties instead of blue ones, thus prompting a price change to maximize profits?
But once we all get back to planet Earth, the majority of us return to our rather average-sized data projects, perhaps feeling like we can never make our data talk to us the way Facebook and Amazon have done. Companies like eBay, Google and Walmart adopted big data technology because they actually had systems with incredible amounts of data. They had a real challenge to solve. But the truth of the matter is that the majority of systems out there just are not that large. I attended a conference recently where the discussion was around analyzing big data, but the data sets being discussed were a few hundred gigabytes, or a terabyte or two at most. That’s large, but modern processing systems can handle terabytes with ease.
The real value coming out of the big data hype machine for most businesses is better analytics on a more “local” level, pushing users to try a bit harder to gain insight from the data already available, in the format it is already in. Entire teams have formed within large companies for the sole purpose of mining data for lost dollars or inefficiencies. It is practically standard now for independent software vendors to build ad hoc analysis tools into their products in order to put the power of better analytics into the hands of business users.
The increased attention on big data and business intelligence has punched the accelerator for these companies to start using their data well beyond its operational value in order to make better business decisions. These systems have moved up the technology ladder: from simple query capabilities, to complex reports that take weeks and a DBA degree to develop, to self-service drag-and-drop analytics that allow business users to answer their own questions. Companies whose systems lack an easy way for business users to perform ad hoc analysis, augmented with data visualization, are limiting the potential value of those systems. For the majority of enterprise systems out there, we should be able to realize that value without Hadoop, Hive, Oozie, Sqoop, HBase, ZooKeeper, Pig and the rest of the animal kingdom.
Now before all my good friends at IBM, Hortonworks, Greenplum and others fill my inbox with hate mail, I’m not discounting the value of big data initiatives altogether. In fact, I see real value in having systems be an additional feed into larger big data initiatives, thus increasing the overall business value of those solutions. Let the good work of the big data companies, along with their associated marketing messages, inspire you to reflect on how you are analyzing the data you already have. While the data scientists at your company are tinkering with these new big data technologies, you can be getting to know your data better using more accessible analytics technologies today.
The views in this commentary do not necessarily represent those of Information Management or SourceMedia.