More and more companies recognize the power of analytics as part of their competitive strategy. But most solutions only provide a glimpse of what can be achieved. What is the potential impact when performance barriers fall away? I’d like to explore the possibilities and introduce a few examples of companies leveraging the intelligence in their data in new and unexpected ways. After all, competing is good, but winning is better.

In finance, the term “arbitrage” refers to the ability to find and exploit market disparities (hedging strategies monitoring currency or securities fluctuations being prime examples). Most arbitrage opportunities are extremely time-sensitive: you have to recognize value in an overlooked stock and swoop in to buy it before others take notice, get the same idea and drive up the price. On Wall Street, an arbitrage virtuoso who can consistently spot untapped potential before others do is worth his or her weight in gold.

Leaping through Tiny Windows

The term “information arbitrage” has many similarities to its finance equivalent, and it’s a good way to think about the impact that analytics can have on a company or even an entire industry. Information arbitrage is about finding game-changing intelligence buried in vast, unappreciated data assets and exploiting it to leap ahead of the competition. Like a financial investor, the information arbitrager takes advantage of an opportunity before the window slams shut (which can be very fast, indeed).

Companies in certain industries make particularly good arbitrage candidates. These are companies dealing with "big data" - tera-scale or even peta-scale databases and a constant flood of incoming data. Telecommunications, e-business, RFID retail applications and online advertising are a few segments that come to mind. Often, the operational data is changing very quickly and key insights are only found at a very granular level. Now suppose this analysis normally takes hours or days, and one company can suddenly do it in minutes, seconds or even fractions of a second. As many businesses are coming to find out, this kind of intelligence disparity can have dramatic implications both for that company and its market.

For example, telecommunications is a high-volume, low-margin business. Constant changes in network utilization demand real-time decisions about rating and pricing structures for an operator to stay competitive. By running pricing scenarios against billions of call data records and by examining individual customers to determine their current calling patterns and preferences, a major telecom operator knows exactly which options to offer each customer. In contrast, competitors might see that customer only as part of a larger segment measured at some point in the past and may come up short with their offers and pricing.
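The per-customer rating described above can be sketched in a few lines: aggregate each customer's call records, then price every plan against that customer's actual usage rather than a segment average. The plan names, rates and records below are hypothetical, purely for illustration.

```python
# Toy sketch of per-customer plan pricing from raw call records.
# Plans and rates are hypothetical: (monthly fee, included minutes, overage per minute).
from collections import defaultdict

plans = {
    "basic":     (20.0, 200, 0.40),
    "unlimited": (45.0, float("inf"), 0.0),
}

def cheapest_plan(call_records):
    """call_records: iterable of (customer_id, minutes). Returns {customer: best plan}."""
    usage = defaultdict(float)
    for customer, minutes in call_records:      # aggregate each customer's actual usage
        usage[customer] += minutes
    result = {}
    for customer, minutes in usage.items():
        def cost(plan):
            fee, included, overage = plans[plan]
            return fee + max(0.0, minutes - included) * overage
        result[customer] = min(plans, key=cost)  # price every plan against real usage
    return result

records = [("alice", 150), ("alice", 120), ("bob", 90)]
print(cheapest_plan(records))  # {'alice': 'unlimited', 'bob': 'basic'}
```

At production scale the same logic runs against billions of records, which is exactly where the performance barriers discussed below come in.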

Several challenges to big data analytics make arbitrage opportunities difficult to pursue. Predictive modeling, optimization and other analytic applications are much more processor-intensive than the SQL queries used in standard business intelligence applications. When complex algorithms and gargantuan databases converge with real-time business demands, something usually has to give.

Many companies find they are unable to fully exploit their growing data holdings and have to make do with sampling or high-level summaries rather than the complete, granular data they often want to examine. But using partial or high-level data can be dangerous; even the most powerful algorithms can suggest spurious or meaningless conclusions when they are applied to insufficient data. Companies may also lose hours offloading data from the data warehouse to an external cluster of processors to run the analysis. With all these approaches, the result is an incomplete solution that provides just a hint of the possibilities of analytics because that’s all the current technology is capable of delivering.

Consider the problem of optimization, for example. Optimization solutions play a key role in helping companies target the right customers, make the right offers, determine manufacturing volumes or accurately price products to take full advantage of market conditions while minimizing expenses. Depending on the problem being addressed, an accurate optimization solution needs to account for many variables and constraints such as products, branches, budget, time, contact channels, offer history, market segmentation and privacy preferences, to name a few.

Due to the multiple permutations and combinations among the different elements, even a simplified optimization model limited to only one month of data, 1,000 customers and 10 different offers results in an astronomical solution search space of 2 to the power of 10,000 candidate solutions. Just to put things in perspective, the number of atoms in the observable universe is only about 10 to the power of 81, a figure this model already surpasses with fewer than 300 of its 10,000 binary decision variables.
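The arithmetic is easy to verify. Assuming, as above, a binary include/exclude decision for each of the 1,000 × 10 customer-offer pairs:

```python
# Sanity check on the search-space figures quoted above.
customers = 1_000
offers = 10
binary_decisions = customers * offers       # 10,000 yes/no decisions
search_space = 2 ** binary_decisions        # 2^10,000 candidate solutions

atoms_in_universe = 10 ** 81                # rough estimate

print(len(str(search_space)))               # 3011 decimal digits, i.e. ~10^3010
print(search_space > atoms_in_universe)     # True, by a factor of ~10^2929
```

Even enumerating one candidate per atom in the universe would not make a dent in a space this size, which is why the complexity-reduction techniques below matter.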

The "big math" at the heart of this kind of analysis pushes most processing technology to its limits and beyond. As the number of variables and constraints grows linearly, the search space grows exponentially, and many of these optimization problems are NP-complete. As a result, companies are forced to compromise on the thoroughness of the analysis and/or the response time they are willing to tolerate. Most optimization efforts look at small snapshots of the total data available (for example, only the last month’s data) and rely on techniques such as linear, dynamic and integer programming, Lagrange multipliers and cluster analysis to reduce the level of complexity in various ways, all in an attempt to reach an actionable result in a realistic time frame. But even with these approaches, companies face costly infrastructure requirements, incomplete views of their data and lengthy response times, resulting in stale data or missed arbitrage opportunities.
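One of the complexity-reducing techniques named above, dynamic programming, can be illustrated with a toy offer-selection model (hypothetical costs, returns and budget): each offer has a cost and an expected return, and we maximize return under a total budget. Brute force would scan all 2^n subsets of offers; the dynamic program below runs in O(n × budget) instead.

```python
# Dynamic programming (0/1 knapsack) applied to a toy offer-selection model.
def best_offer_value(costs, returns, budget):
    """Max total expected return from offers whose combined cost fits the budget."""
    best = [0] * (budget + 1)           # best[b] = best return achievable with budget b
    for cost, ret in zip(costs, returns):
        # Iterate the budget downward so each offer is selected at most once.
        for b in range(budget, cost - 1, -1):
            best[b] = max(best[b], best[b - cost] + ret)
    return best[budget]

costs   = [4, 3, 2, 5]    # cost of extending each offer (hypothetical units)
returns = [10, 7, 4, 12]  # expected return per offer
print(best_offer_value(costs, returns, budget=7))  # 17 (offers 1 and 2)
```

The trade-off is exactly the one the article describes: the model must be simplified enough (integer costs, a single budget constraint) for such shortcuts to apply.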

But what if you could bypass the existing performance limitations and get crucial intelligence much faster than before? For example, what if a database marketing company could use complex algorithms to get accurate optimization results days before the market could adjust? Or a retail franchise could precisely adjust the prices of thousands of products daily for each of its stores? Or a credit card company could run customer scoring algorithms 100 times faster than its competitors? Or a financial services firm could run real-time Monte Carlo simulations on terabytes of data to manage risk? What impact could advantages like these have on a business? It’s fair to say the difference would be game changing, providing a major competitive advantage and the ability to enter new markets previously out of reach.
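The Monte Carlo risk simulation mentioned above has a simple core, shown here as a minimal sketch assuming a hypothetical portfolio whose daily return is normally distributed. Real systems run this same structure against terabytes of positions.

```python
# Minimal Monte Carlo value-at-risk sketch (hypothetical portfolio parameters).
import random

def monte_carlo_var(mean, stdev, horizon_days, trials, confidence=0.95, seed=42):
    """Estimate value-at-risk (as a fractional loss) over a horizon of days."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(trials):
        value = 1.0
        for _ in range(horizon_days):           # simulate one return path
            value *= 1.0 + rng.gauss(mean, stdev)
        outcomes.append(value - 1.0)            # total return for this path
    outcomes.sort()
    # Loss that is not exceeded with `confidence` probability.
    return -outcomes[int((1.0 - confidence) * trials)]

var_95 = monte_carlo_var(mean=0.0004, stdev=0.02, horizon_days=10, trials=10_000)
print(var_95)  # 95% 10-day VaR as a fraction of portfolio value
```

The arbitrage angle is the trial count: more paths over more granular position data mean tighter estimates, and whoever can run them in real time prices risk before the market does.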

These capabilities are not just marketing fantasies or future visions - they’re in use today.

Streaming analytic appliances make these information arbitrage opportunities possible. Streaming architectures are built to run complex mathematical models on huge data sets, delivering results in a fraction of the time required by traditional technologies. Sophisticated analytic applications run "on stream" in the data warehouse against all the records and detail that need to be examined. There’s no need to settle for summary data or aggregations, or to ship data to another system for analysis.

Opening up these architectures to innovative developers in the corporate world and academia will also help create a new generation of analytic applications that were previously impractical, unaffordable or simply impossible. When exploiting an arbitrage opportunity means leveraging big data and big math, a streaming architecture is inherently faster and more efficient than other technologies.

The bottom line is that when big data meets big math, great things become possible for businesses, enabling them to:

  • Use information arbitrage to take advantage of time-sensitive opportunities.
  • Rapidly run multiple scenarios and sensitivity analyses in near real time.
  • Make use of all the available data, all the time, while their competitors are still struggling with the reduced visibility of sampled or aggregated data.

When data warehouse appliances burst onto the scene in the early 2000s, their ability to query giant databases with unprecedented speed upset a lot of preconceived notions about the limitations of technology and what companies can do with their data. Advanced analytic applications take processing complexity to a much more challenging level, and once again the capabilities of a new breed of streaming analytic appliances are revolutionizing the market and capturing the imagination of businesses.
