I have been hearing from a few people who now think the term “big data” caught on a little too fast. Nobody is fighting it, and nobody could anyway; the term is embedded in the data lexicon and will stick in magazine, vendor and analyst headlines for a good long time.
That’s partly because the topics around the term are entirely valid and timely. The problem is that “big” has become too confining a dimension for most of the discussion.
Stories and reports are surfacing daily in all shapes and sizes and that’s mostly okay too. Exciting trends in integration, processing, analytics, social, mobile and cloud are pushing the data discussion. Marketing’s job is to run with what’s catchy, and marketers know they are onto something. I’m sure they are testing mnemonics for B, I and G as we speak.
What’s happening with big data applied to business technology sounds a lot like what happened to aspirin when applied to pain. Looking back, aspirin came through very well.
The confusion arises when we literalists try to stay on topic. Almost all of us know well that size, mass, velocity and complexity, the things we believe characterize big data, are all different things. There is no “speedy data” or “heavy data” to turn to.
“Big” is also a very portentous word sitting standalone next to data, not unlike “black” next to the word “holes,” so people expect to be impressed. I recall some readers feeling deflated when we revealed a couple of years ago that the finished digital file for the movie AVATAR was less than 3 TB. You mean the master could fit on a hard drive from Best Buy? (It couldn’t then but it can now.)
Just this week I found myself waxing enthusiastic about developments like Esri’s multi-thousand demographic database getting mashed up with Navteq’s road, terrain and building mapping that gathers hundreds of data points and pictures per kilometer traveled. Delivered on a tablet or a desktop, it’s blitz-time support for business location planning. But Navteq’s stunning North American map database is only about 64 GB. That’s a thumb drive to a big data geek, though the end result looks amazingly thorough and fast.
It shows how quickly we and our expectations are evolving. Pointedly, last week on DM Radio, our cohost Eric Kavanagh offered a timely headline, “How Big is Big?, Why Big Data Comes in Various Sizes” that turned into a fun discussion. And though the very smart and very experienced people on the show had great observations to share, none chose to give a definitive answer to the first half of the title.
Some folks look at big data as a technology engineering problem and it certainly is that. You can think about the topic as a series of integration and hyperprocessing opportunities that include combined computing delivery channels, cratering CPU and storage costs and massively parallel processing.
In-memory technology also fits the bill, with scale that is expanding exponentially, but it is especially notable for frighteningly fast answers. Others will say that in-memory is not yet huge enough to be “big.”
What you could call ginormous (or just humongous) data sets are certainly one dimension of the problem. Theoretical data scientists are busily doing their work or finding their calling. I have one good friend inside the Washington Beltway who was flooring me with her own scientific approach and “how big is big” thinking about Internet-sized data sets back when toy elephants were still being stuffed.
For better or worse, many interested parties are going to say the words big data when they talk about technologies that are fast, elastic, accommodate previously untapped ranges (I won’t call them volumes) of data and reach broadly into them analytically. It is a work in progress.
But in the end it will be a business challenge that will bring different needs to different organizations. It will be the sum of smaller developments that move the bar. I am reminding myself regularly that this is an exploratory frontier that promises much, and promises more every day. It has to be made steadily real to the mainstream, even in little ways, or we will torment users as much as waterfall development and impressively “big” data warehouses ever did. That’s just how high expectations run these days.
We can hope that most of those charged with the task of big data will soon find their mission, maybe even more quickly than we think, and discover or demand the tools they need. Rest assured, their data will be big enough.