Don Campbell, IBM's CTO for analytics and performance management, sees opportunities for IBM to be involved in the bigger picture of information commerce. "New uses for data have brought change to companies that produce, package, massage and sell information. We would like to be able to provide a platform that can consume your own data along with those outside sources and aggregated feeds." Combining outside operational feeds with historic patterns and time dimensions, Campbell says, is a big trend in predictive analytics as a way to drive business forward.
Governance and Risk
Opportunities abound, but what lies in the gap is data quality and governance. With a growing abundance of discrete data services on the Web, companies are gathering and using information from many more sources, some trusted and others less so.
Going forward, businesses will need to add risk management to the governance of data that no longer comes only from internal resources. "We almost had governance roped in and now we have a whole new Wild West and we have to go back to the drawing board in terms of discipline," says Saugatuck Technology VP Mike West. "Is information authentic? Is it valuable? What does a lifecycle of information look like now?"
For many feeds, governance and data quality are already implied by provenance. Legacy providers such as Dow Jones, D&B and Equifax have earned roles that were once managed within enterprises by providing information that is current and comes with a service level of authenticity. But even that raises confusion over who really owns the golden record of corporate trust in data.
Flip a coin, Karel says. "Many organizations use D&B as the record of truth and many others use D&B to validate their own database. Before I was an analyst, I was an end user and I've gone both ways myself."
On the frontier of social and contact media, companies have to manage risk in data they can't control, but still need. It is difficult to merge the idea of "crowd wisdom" with our definitions of data quality, but any educated stock investor would agree that momentum (eyeballs) and fundamentals (quality data) are both valid strategies.
Campbell sees value in "gray" data that needs to be scored. "I know I can't trust information from Wikipedia, but odds are there is some truth there and if I ignore it I'm just hurting myself."
That is ready justification for risk analysis of curves of likelihood for different information scenarios. "The CIA has quality assessments assigned to any information they capture," West says. "'Something' is better than a rumor, so you need to think about a level of quality that creates a trigger for a decision."
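The idea of a quality level that "creates a trigger for a decision" can be sketched in a few lines. This is an illustration only: the source tiers, scores and threshold below are hypothetical, not any real intelligence-community or vendor scale.

```python
# Hypothetical confidence scores by provenance tier; an action fires
# only when a source's score clears the decision threshold.
CONFIDENCE = {
    "verified_provider": 0.9,   # e.g. an established commercial feed
    "community_curated": 0.6,   # e.g. crowd-maintained sources
    "rumor": 0.2,               # unvetted chatter
}

def decision_trigger(source_type: str, threshold: float = 0.5) -> bool:
    """Return True when the source's confidence clears the threshold."""
    return CONFIDENCE.get(source_type, 0.0) >= threshold

# A rumor alone should not trigger action, but a curated source might.
print(decision_trigger("community_curated"))  # True
print(decision_trigger("rumor"))              # False
```

The point is not the particular numbers but that "gray" data gets a score rather than a binary trust/distrust verdict.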
There is no doubt that analytics can significantly advance our understanding of at-large data. As information economies develop toward identifiable standards of accuracy, Gardner and many others believe strongly that analytics will be the best answer available to rationalize a sample or universe of data. "Something that is not scientific is nonetheless reflective of a trend or zeitgeist [between] behavior and reality."
Standards for a Data Economy
When more networks of data become economies, companies will look for more metadata to segregate types of information and their relative merits and quality. "No one is suggesting we're going to replace our legacy systems and data farms," Kaplan says. "But you want to ask how effective you have been with your own data silos as well."
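The kind of metadata that could segregate feeds by type and quality might look like the record below. A minimal sketch, assuming invented field names; no actual metadata standard is implied.

```python
from dataclasses import dataclass

@dataclass
class FeedMetadata:
    """Illustrative descriptor a consumer might attach to an external feed."""
    name: str
    category: str         # e.g. "contacts", "financial", "social"
    provenance: str       # who produced it
    quality_score: float  # 0.0 (untrusted) to 1.0 (authoritative)

feeds = [
    FeedMetadata("vendor_contacts", "contacts", "commercial provider", 0.9),
    FeedMetadata("social_mentions", "social", "public web", 0.4),
]

# Segregate feeds by whether they clear a quality bar.
trusted = [f.name for f in feeds if f.quality_score >= 0.7]
print(trusted)  # ['vendor_contacts']
```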
This is precisely why specialized service vendors and networks have emerged, since no organization can hope to close all the gaps or process the 15 petabytes of new information IBM estimates are created each day.
It's still garbage in, garbage out. "I hope people are taking a risk approach because, yes, things will get better, which means you'll have to revisit the bets you make internally and with others," Dresner says.
But like the dial tone of a telephone, infrastructure is always receding to a presumption of connectivity and trust. We know standards have arrived when we no longer pay attention to them. The toast always fits into the toaster and the toaster plugs into any wall socket.
When data economies are reality, they will likewise come with assumed connectivity that lets information producers and consumers focus on the data, not the details.
And while standards of dependability continue to mature, the market will set the rules and we'll judge the interim value of data in part by what we are willing to pay for it. Whether it's a pig in a poke or a highly regarded source, there is every reason to expect we will be talking more about the value of information and less about the details of the infrastructure that supports it. The Web as integration platform has arrived and with it, an economy of its own.
(STORY SIDEBARS REFERRED TO IN ARTICLE FOLLOW -ED)
The Communal Data Service
The ability to monetize a data feed can arise from dedicated users with a common interest. A service provider called Jigsaw brings a tribal approach to managing business contacts and prospects, backed by five years of growth and $18 million of venture funding.
"The way we aggregate our database is through community, over a million registered members at Jigsaw," says CEO Jim Fowler. "But we make our money by cleaning databases for enterprises with our data as a service product, DataFusion, where enterprise customers, if they share information, can lower cost further."
The premise of Jigsaw is that it's folly for an individual or an enterprise to maintain a current list of clients and prospects on its own. So individual members trade updated contact information for new contacts, and enterprises pay by the seat for access to 3.6 million company records and 18 million contact records, a list that grows by 25,000 records per day. Data records of enterprise customers are batch cleansed nightly, and those that open their own records to Jigsaw lower their monthly per-seat cost from $99 to $79. Of roughly 100 enterprise customers signed in the last four months, 40 percent share their information with the greater Jigsaw database.
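The seat pricing above is simple arithmetic; the figures ($99 versus $79 per seat per month) come from the article, while the function itself is just a worked example.

```python
def monthly_cost(seats: int, shares_data: bool) -> int:
    """Jigsaw-style seat pricing: $79/seat if the enterprise shares
    its records with the communal database, $99/seat otherwise."""
    rate = 79 if shares_data else 99
    return seats * rate

# A 50-seat customer saves $1,000 a month by sharing its records.
print(monthly_cost(50, False) - monthly_cost(50, True))  # 1000
```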