Well, I do. One of my favorite scary movies is “The Sixth Sense.” There is a famous scene in this movie where a horrified child admits to his friend that he sees dead people. Do you recall it? While hiding half of his face under a blanket like a toddler, he whispers: “I see dead people, all the time. They are everywhere.”
That scene reminds me of some of the executives and managers that I often cross paths with when I ask them about their data analytics practice. The look on their faces brings me right back to that movie scene, but instead they say: “We see big data, all the time. It is everywhere.”
And then part of me wants to be sympathetic, hold their hands and say: “Yes, I know dear. Big data can be scary. Fear no more as I know some people that can help you. They have cool colored promotional paper saying so!” But then the auditor and data geek in me makes me quickly snap out of it and probe them: “Are you sure what you see is actually big data?”
I call this “Big Spooky Data Syndrome.” See, if your data doesn’t make you feel like you are standing on slippery stones in a rushing torrent, trying to catch a fish bare-handed, with no idea whether there are any fish in it at all, then you, my friend, most likely DO NOT have big data, at least not yet.
As in a torrent, volume (or size) is just a part of it. Let’s recap the definition of ‘torrent’: a strong and fast-moving stream of water or other liquid.
Now let us adapt it to big data: A strong and fast-moving stream of multi-structured data.
“You are going to need a bigger boat.” – Jaws (1975)
Not sure yet? Muzamil Riffat touched on the topic in his article “Big Data – Not a Panacea” for the ISACA Journal by covering the characteristics of big data, which are also known as the “Vs” of big data: volume, velocity and variety. Since then, IBM coined veracity as the fourth V, referring to the uncertainty of the data.
And to assist you even further, here are a few questions to help you determine whether your company indeed has big data, based on these “Vs”:
☑ Is it too large for a MySQL database?
☑ Is your data a Frankenstein, spread over multiple files, servers and/or geographical locations?
☑ Does your data have a much longer and uncertain life span?
☑ Would recovering your data after corruption force you to re-perform transformations?
☑ Does it contain audio, videos or images?
☑ Does it require immediate response, like high-frequency trading (HFT)?
☑ Is it being generated in real-time, like social media platforms, IoT or other internal sensors?
☑ Do you need a transformation tool to make it identifiable and legible?
☑ Do you need a reduction tool to make it more manageable?
☑ Is the range of potential correlations and relationships between disparate data sources too great for any analyst to test all hypotheses?
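If you would rather tally your answers than eyeball them, the checklist above can be sketched in a few lines of Python. Treat this as a rough illustration only: the question keys, the grouping of each question under a particular “V” and the function name tally_by_v are all assumptions made for this example, not part of any standard or of the articles cited here. The recovery question counts as a “yes” when recovery is hard, not easy. The sketch deliberately returns counts rather than a verdict, since no threshold is defined.

```python
# Hypothetical grouping of the checklist questions under the four "Vs".
# The keys and the V assigned to each question are assumptions for illustration.
CHECKLIST = {
    "too_large_for_mysql": "volume",
    "spread_over_files_servers_locations": "volume",
    "needs_reduction_tool": "volume",
    "requires_immediate_response": "velocity",
    "generated_in_real_time": "velocity",
    "contains_audio_video_images": "variety",
    "needs_transformation_tool": "variety",
    "too_many_correlations_to_test": "variety",
    "long_uncertain_life_span": "veracity",
    "hard_to_recover_after_corruption": "veracity",  # "yes" means recovery is hard
}


def tally_by_v(answers):
    """Count the "yes" answers per V. Unanswered questions count as "no"."""
    counts = {"volume": 0, "velocity": 0, "variety": 0, "veracity": 0}
    for question, is_yes in answers.items():
        if question not in CHECKLIST:
            raise KeyError(f"unknown checklist question: {question}")
        if is_yes:
            counts[CHECKLIST[question]] += 1
    return counts


if __name__ == "__main__":
    # Example: a company whose only "yes" is that it stores images.
    answers = {
        "too_large_for_mysql": False,
        "contains_audio_video_images": True,
        "generated_in_real_time": False,
    }
    print(tally_by_v(answers))
    # prints {'volume': 0, 'velocity': 0, 'variety': 1, 'veracity': 0}
```

A tally with a single “yes,” as in this example, suggests plain enterprise data rather than a torrent. How many boxes must be ticked before the data counts as big remains a judgment call.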
Furthermore, since big data is noisy, highly interrelated and unreliable, machine learning techniques are most often applied instead of data mining techniques. And this is why data scientists are often required.
So if you don’t have big data, what do you have? Small data? Hold your horses, cowboy. The term small data is now being used to describe a new breed of data. As Deborah Estrin stated at TEDMED 2013, "Small data are derived from our individual digital traces. We generate these data because most of us mediate or at least accompany our lives with mobile technologies. As a result, we all leave a 'trail of breadcrumbs' behind us with our digital service providers, which together create our digital traces."
In summary, big data is about machines and processes while small data is about people.
Shall we agree, then, to call data that is neither big nor small simply organization data? Or enterprise data, if you will?
With all the buzz around big data, it is understandable that many still get confused about the term and conclude that (1) their data qualifies as big data, and (2) a high-science big data solution must be the only legitimate way to approach data analytics.
And since deploying big data analytics can be daunting and expensive, analysis paralysis is often the outcome. Companies then completely overlook and under-leverage their existing enterprise data, for which analytics are much easier to deploy.
“You cannot run from this, it will follow you. It may lay dormant for years. Something may trigger it to become more active and it may over time reach out to communicate with you.” - Paranormal Activity (2007)
Enterprise data analytics can still deliver a lot of value, since the data already exists, is in good shape and is well understood. I believe companies that first mature their current data strategy, governance and enterprise data analytics have a greater chance of succeeding with big data analytics later on.
And even when companies do have big data, I personally believe that many are being bullied by the hype and by vendors into moving too quickly to adopt big data analytics. Vendors show up at your door offering the dream of better decision making and competitive advantage. And like the Borg in a good old Star Trek episode, they assimilate you because, without proper knowledge, resistance is indeed futile.
“The box…You opened it, we came.” – Hellraiser (1987)
News flash: Better and smarter business decisions aren’t guaranteed, no matter the size of your data. Having all the data and more of it doesn’t do much good if one isn’t asking the right business questions or simply doesn’t understand the underlying assumptions. Not all numbers are created equal; some are more reliable than others. But this is a subject to be explored in a separate future article.
A premature adoption of big data analytics can do more harm than good, as it can introduce additional risks, such as privacy risks. Not all companies are equipped to make use of big data analytics. Some may be missing key skills in their existing personnel, or they may be missing critical portions of the technological ecosystem.
In summary, my point is that big data analytics might not be applicable to your organization, and if it is, don’t be bullied into adopting it right away as it isn’t mandatory. However, not doing any sort of data analytics in this data age would be the same as continuing to stock buggy whips once the car has been invented.
And if you need help getting started, as I mentioned previously, I know some savvy people with cool colored promotional papers who can help you out.
“Whatever you do, don’t fall asleep” – A Nightmare on Elm Street (1984)
Davenport, Tom; “Big Data vs. Small Data Analytics,” MSI.org, 3 December 2012
Dell’Anno, Vince; “For Businesses, It’s Worth Jumping Into the Big Data Torrent,” Wired.com, 25 September 2014
IT Business Edge; “Five Ways to Know if Your Challenge Is Big Data or Lots of Data”
Pollock, Ryan; “Beyond Big Data vs. Small Data: how to get to Smart Data,” GroupSolver.com, 29 May 2015
Riffat, Muzamil; “Big Data—Not a Panacea,” ISACA Journal, Volume 3, 2014
(About the author: Karina Korpela is a senior manager with MNP, a data analytics and data security expert, and a member of ISACA. This post originally appeared on her ISACA blog.)