Owning data costs businesses money. Creating the data costs money. Storing the data costs money. Retrieving the data costs money.

What is the hottest trend in information technology (apologies to Web zealots)? The competitive pressure driving businesses to organize their data into "internal reference libraries," known as data warehouses.

Extracting, scrubbing, conditioning, merging, purging, cross-referencing and validating all that data costs even more money. Nevertheless, data that is coherently organized and easily accessed from the desktop can immediately lead to decisions that are more timely, more informed and more precise.

Data can now follow the equation to profitability:
Data -> Information -> Knowledge -> Better Decisions.
As this happens, possession of data changes from a cost into an asset.

What can businesses do to ensure that their use of data follows this path? Data mining technologies should be applied throughout the process to guarantee success.

Data Warehouse Explosion

Data warehousing is experiencing explosive growth. Currently a $5 billion market, data warehousing is forecast by META Group to grow into a $20 billion industry within three years.

Ironically, this growth is attributed, in part, to the tendency for many businesses to view data warehouses as an end-solution in themselves.

The president of Black & Decker in Canada once made the pithy observation that "customers don't want drills, they want holes." The "drill," in our case, is the data warehouse, and the "holes" we're trying to make are better decisions.

But a data warehouse is only as good as the business objective that led to its creation. In the early days of data warehousing, systems were set up to pursue well-defined business objectives. Recently, in the stampede to warehouse every free-floating bit and byte, we have forgotten that the original purpose of data warehousing is to aid business decision making.

If we cannot answer the question "What business problem does this warehouse solve? (What holes does this drill make?)" we are doomed.

When I was involved in the early days of data warehousing, we would write business cases by interviewing executives and asking "What would be the value to your business of improved decision making, using the reports you have always wanted, but cannot get now?"

We would take the answer, divide by ten, add it to the other answers from the other executives and still get a staggeringly large number. In fact, I cannot recall ever getting a number less than $100 million. Everyone was flying blind--and knew it.

Data Mining to Ensure Success

Data mining can play two roles in this process. First, data mining can help assess the usefulness of raw data for various business objectives without going through all the steps of the warehouse process.

Second, a data mining application can superimpose a carefully crafted "bit" over the data warehouse "drill." By making sense out of the relationships and patterns inherent in the data, the data mining application transforms the data warehouse into the decision-making tool it was intended to be.

Data mining will grow explosively. Currently a $50-$100 million market, data mining is forecast by META Group to grow into a billion-dollar industry within three years.

Data mining tools, whether based on decision-tree algorithms, neural-net algorithms or family clusters, illuminate patterns in data that no business has the time, energy or computer power to find on its own.

Data mining has proven itself immensely useful in a number of fields, including detecting auto insurance fraud, diagnosing failures in manufacturing processes, market-basket analysis, market segmentation and assessing risk patterns in consumer credit loans.

A simple example of data mining's usefulness is the practice of looking for market segments. In any given business there may be anywhere from 20 to 500 market segments, all of which can be flushed out by analyzing the underlying patterns of preference and behavior over time for one product or for an entire line of products. A company that offers 100 products may discover, through data mining, a mosaic of overlapping market segments, which the company can then validate through marketing campaigns, promotions, pricing or product bundling schemes. All data gathered through the validation process then goes back into the database, thereby notching its decision-making value for the company to a higher level.
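As a rough illustration of how segments might be drawn out of behavioral data, here is a minimal sketch using k-means clustering. K-means is one common clustering method chosen for brevity; it is not the technique of any tool named in this article. The customer records, the two features and the starting centroids are all hypothetical.

```python
from math import dist

def k_means(points, centroids, rounds=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    for _ in range(rounds):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members (keep it if it has none).
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else c
            for members, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical customers: (purchases per month, average basket value in dollars).
customers = [(1, 20), (2, 25), (1, 30), (8, 200), (9, 220), (10, 210)]

# Hypothetical starting guesses: an occasional buyer and a heavy buyer.
centroids, segments = k_means(customers, centroids=[(0, 0), (10, 250)])

for center, members in zip(centroids, segments):
    print(center, members)
```

In practice the features would be scaled to comparable ranges first, and the number of segments would itself be a question put to the data rather than fixed in advance.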

As the segments are observed over time, predictions can be made, tested and improved upon.

In the medical field, a stunning example of data mining's value was revealed by the Oxford Transplant Center in Oxford, England. Using ANGOSS KnowledgeSEEKER, a data mining query tool, two doctors (neither of whom was a trained statistician) made several startling discoveries.

They learned that elderly patients receiving transplants did not, in fact, experience less stress if they were operated on at a slower rate than younger patients. They learned that certain drugs given to the elderly to prevent organ rejection were doing more harm than good. And they discovered a correlation between a successful transplant and how many minutes the donated organ sat at room temperature before the operation. Each discovery, achieved through the analysis of mined data, challenged long-held assumptions and effected major changes in the clinic's medical procedures.

Data Mining as a Springboard

Perhaps data mining's most useful role is just beginning to be explored. Visionary companies are finding that data mining is part of a holistic approach to customer/product management. These visionaries view data mining as a springboard to achieving a "whole business solution."

A company skilled in mining its data will first integrate the mining tool with its database, enabling it to keep track of its data over time, thereby learning from it and evolving that knowledge as trends and the marketplace change. Second, the embedded tool, scripted to search out particular relationships or patterns, evolves a set of templates that can be used to look for other types of insights.

These scripts are on the cutting edge of data mining now, but within a year they will be commonly available through leading data mining applications. An example would be a script that could mine a quarter of a year's banking data, involving up to a billion transactions, ferreting out customer preferences, risk patterns and market segments--all in real time.

Pooling Mined Data

Once a company has achieved scripting and a sophisticated level of data-supported decision-making, the next step is to pool this data--data which now has tremendous value not only to that company but to others who are looking for ways to improve their decision-making capabilities.

Pooling data is a way of achieving validation, as well as extending the predictive abilities of data. Pooling circumvents the sampling conundrum. With today's computer systems, it is impractical to look at all the data you have, so you sample--to test whether you are truly working with a representative customer base and to test whether your customer base represents the world market. But samples, by definition, are small. Pooling data among companies allows for sampling on a much larger, more useful scale.
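The benefit of a larger sample can be made concrete with a small calculation. The sketch below uses the standard normal-approximation margin of error for a proportion; the 10 percent response rate and the record counts are hypothetical, and it assumes simple random samples drawn from comparable customer populations, which pooled data from different companies may not satisfy.

```python
from math import sqrt

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated from n records."""
    return z * sqrt(p * (1 - p) / n)

# Hypothetical: one company samples 10,000 customers; four companies pool 40,000.
one_company = margin_of_error(0.10, 10_000)
pooled = margin_of_error(0.10, 40_000)

print(f"one company: +/-{one_company:.4f}")
print(f"pooled:      +/-{pooled:.4f}")
```

Under these assumptions, quadrupling the sample halves the margin of error, because the error shrinks with the square root of the sample size.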

As the business culture begins to evolve from thinking of data as a cost to data as an asset, companies will want to pool data out of self-interest. Well-documented examples are American airline companies that pool their reservation database systems, credit bureaus that share data for the economic value of obtaining a wider look, and medical institutions that pool in order to make more reliable predictions.

The practice of data mining will lead us to a paradigm shift. Instead of viewing each company's data as "different" and, therefore, not useful to other companies, executives will recognize that mined data possesses a decision-making value that is marketable.

Company executives will also begin to ask themselves, "My data has led me to a solution, but is my solution as good as it can be? If I could look at more data, would I be able to make an even better decision?"

In terms of data, it is an axiom that if you have a little and then get a little more, it's an improvement. If you get a lot more, it's a profit. It's turning that straw into gold!
