It’s true after all: size doesn’t matter. Big Data really is about the ability to analyze and act in real time (fast data), using structured and unstructured data from sensors, transactions, and interactions from inside as well as outside the organization.
This ‘Big & Fast Data’ can be used to solve tougher business problems, improve processes, create more competitive advantage, and make more informed decisions in a tightly connected world. It can even be sold as a product. Increasingly, though, the focus is on creating ultra-fast insights, often within one single CPU cycle, that the business can act on directly or that trigger actions automatically. If there’s no longer a need to wait, the opportunities for radical business reinvention are limitless.
Werner Heisenberg was a German physicist and one of the key creators of quantum mechanics. In 1927 he published his uncertainty principle, for which he is best known. It states: “It is impossible to determine accurately both the position and the velocity of a particle at the same instant.” Position is the identification of the relative location, in other words: where you are. Velocity is the speed and direction, in other words: where you’re going. It’s rumored that Heisenberg went for a drive one day and got stopped by a traffic cop for speeding. The cop asked, “Do you know how fast you were going?” and Heisenberg replied, “No, but I know where I am.”
The same applies to many organizations today. They know where they are (or have been), but they often don’t know where they’re going. The main reason? Their data is ‘at rest’ even when it’s ‘Big.’ It is mostly inactive data stored physically in some digital form – for example in a database or a data warehouse. It is typically used for historic reporting or analysis of mostly internal data by the IT department. Although the quality is high (data warehouses are often associated with a high level of reliability and a single version of the truth), the time to market is long - often involving batch-oriented overnight architectures - and the value is accordingly low.
To be truly successful, organizations must transform from hindsight analysts to foresight action takers. This implies that data should no longer be at rest, but in motion or - even better - in use. It should flow in real time through the organization, changing business outcomes on the fly. Big Data is about where you are (position) and where you’re going (velocity), with speed as the deciding factor.
Data streaming technology such as SAP Event Stream Processor or Informatica’s VIBE Data Stream allows enterprises to collect and deliver small data packages that accumulate into one large ‘Data Lake’. Software like IBM Streams or the SAS Event Stream Processing Engine brings complex analytics to operational data, creating faster insights and interactive visualizations to support business decisions. The in-memory SAP HANA platform supports lightning-fast analytics on operational data as well, and Teradata supports the need for speed with its Massively Parallel Processing database appliances.
Big Data is clearly not only about volume, as the name initially suggests. Volume is still ‘data at rest’ – even when storing massive amounts at ultra-low cost in the newest Hadoop environments like those of Cloudera or Hortonworks. Sheer speed can already make a significant difference. A software infrastructure provider for telecommunications wanted to give its clients better real-time information on the clients' customers, especially regarding their location. This would allow the clients to send targeted marketing offers in real time in order to reduce churn. By connecting Cloudera's Impala MPP SQL Engine for Hadoop directly to MicroStrategy, the time needed to produce crucial reports was cut to just a few minutes, rather than the several hours it took in the past. Sometimes, that's 'real time' enough.
To be competitive in increasingly complex business environments, though, organizations also need to be able to predict future outcomes, like customer behavior. This has to be done based on all the available data, historic and current. Big Data has created a paradigm shift in the way we look at decision making today. Traditionally, structured data from internal systems like ERP was the main source for corporate intelligence.
Now, unstructured data comes from sensors in machines, planes, trains, automobiles, or even your fridge. It allows companies to optimize their clients’ travel, create a predictive shopping list for people in the supermarket, or use smart meter data to warn consumers about their behavior, all adding to the amount of data available. This is also where external data from websites or social media can tell enterprises about their own performance, about their brand, products and services (as Unilever did). Not with facts or dimensions from the IT data warehouse, but with engagement on social channels by customers.
We live in a time where Facebook can predict if someone is about to cheat or commit suicide, where Google can predict a flu outbreak and retailers can deduce that somebody's daughter is pregnant. Governments open up their own data archives and actively encourage people to use their APIs in nothing less than an Open Data tsunami.
It’s important to evaluate the data – through advanced analytics for example with SAS, Matlab or R (Microsoft thinks so as well) – but even more crucial to act before an event takes place.
A credit card transaction of a European citizen in South America shows possible fraudulent behavior. Do we block the card right away? An online retail competitor changes the pricing for their three most popular products. Do we change our pricing policy in real time? A railroad switch suddenly reports increasing energy consumption. Do we proactively perform asset maintenance?
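The three questions above share one pattern: an incoming event is checked against a rule, and a matching rule triggers an action right away. What follows is a minimal Python sketch of that pattern. The event kinds, field names, the 20% energy threshold and the action names are all hypothetical, chosen purely for illustration, and are not taken from any of the products mentioned in this article.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """One incoming event; 'kind' and the payload keys are hypothetical."""
    kind: str
    payload: dict


# A rule looks at one event and returns an action name, or None if it does not apply.
Rule = Callable[[Event], Optional[str]]


def fraud_rule(event: Event) -> Optional[str]:
    # Card used outside the holder's home region: treat as possible fraud.
    if event.kind != "card_transaction":
        return None
    p = event.payload
    return "block_card" if p["home_region"] != p["transaction_region"] else None


def pricing_rule(event: Event) -> Optional[str]:
    # Competitor undercuts our price: trigger a reprice.
    if event.kind != "competitor_price":
        return None
    p = event.payload
    return "reprice_product" if p["competitor_price"] < p["our_price"] else None


def maintenance_rule(event: Event) -> Optional[str]:
    # Energy use more than 20% above baseline (assumed threshold): plan maintenance.
    if event.kind != "switch_energy":
        return None
    p = event.payload
    return "schedule_maintenance" if p["energy_kwh"] > 1.2 * p["baseline_kwh"] else None


RULES: Tuple[Rule, ...] = (fraud_rule, pricing_rule, maintenance_rule)


def decide(events: Iterable[Event],
           rules: Tuple[Rule, ...] = RULES) -> Iterator[Tuple[Event, str]]:
    """Yield (event, action) as each event arrives; events matching no rule are skipped."""
    for event in events:
        for rule in rules:
            action = rule(event)
            if action is not None:
                yield event, action
                break  # first matching rule wins


if __name__ == "__main__":
    stream = [
        Event("card_transaction", {"home_region": "EU", "transaction_region": "SA"}),
        Event("competitor_price", {"competitor_price": 9.0, "our_price": 10.0}),
        Event("switch_energy", {"energy_kwh": 15.0, "baseline_kwh": 10.0}),
    ]
    for event, action in decide(stream):
        print(event.kind, "->", action)
```

Because `decide` is a generator, each action is produced as soon as its event is read instead of after an overnight batch. In practice such rules would more likely run inside a stream processing engine than in a Python loop, but the shape of the decision is the same.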
When ‘real time’ becomes real – with no more need for waiting – the event and the action become one. And not necessarily in that order. It’s not unlike the famous movie Minority Report, in which the police force uses data to predict when and where a crime will take place and sends Tom Cruise to the scene proactively, before the axe falls.
Talk about new business models.
Just imagine what this could do for your organization. Say my name: Big Data, it’s Fast Data.
(About the author: Ron Tolido is an analyst with Capgemini)