"I need an itemized list of claims records of every individual in Suffolk County for the past year."

The chief actuary was seeking support for his theory about the recent upsurge in fraudulent claims activity in the Northeast. The model had been developed, and he needed a statistically sufficient sample of claims data to test it. Fortunately, the data warehouse had not been summarized beyond the individual claims records. Had it been, the operational system would have remained the only system of record for such detail: the same system whose too-small batch query windows were the reason for building the data warehouse in the first place.

Fed with empirical data, the theory can be tested and, if it holds, the model could screen out millions in losses each week.

"How do I get a list of our customers who have purchased diapers from Store 123 in the past three years?"

The marketing director of a retail chain logged this request with the data warehouse help desk. A diaper vendor wished to run a targeted promotion that could raise single-store traffic. Fortunately, the reply was that the data warehouse captured the detail data in addition to the customer-week, store-week and product-week summary levels.

Had it not, and had the data not been granular enough to support specific promotion targeting, another store chain would have received the vendor's business.

"I need to examine the rate of dropped calls along I-95 from Philadelphia to Baltimore by caller, during peak rush hours for the last six months. We think that sector may be contributing to our increasing customer loss rate year-to-date."

The wireless competitor suspected, but couldn't prove, that transmission quality, not price discounting, was driving its customers away. Or could it be something else? Repairing a tower was costly compared to a targeted business win-back campaign. Could they prove a correlation?

Fortunately, the data warehouse was designed for more than analysis of invoice-ready bill details and monthly billing summaries 30 days after the fact. Dropped calls were not lumped into miscellaneous credits.

When you need to drill into customer behavior in detail, over long time horizons, you're grateful that the data warehouse project budgeted for extra accessible storage and made the extra effort to bring high-volume customer data into the warehouse at the transaction-item level. This data is called atomic data: the details of each transaction at each touchpoint.

Atomic data is the most granular data possible. Numerous problem-oriented data marts can build on this grain to grow the enterprise's use of the data. Rather than limiting your queries to summary data, storing atomic data gives the CRM-ready data warehouse maximum flexibility. The ability to build a fantastic summarization scheme into your warehouse, and still get to the details when necessary, begins with storing the atomic data.
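The point can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the record layout, field names and sample values below are hypothetical and do not come from any real warehouse. It shows that a store-week summary can always be derived from transaction-item detail, while a detail question such as the diaper request can only be answered from the atomic rows.

```python
from collections import defaultdict
from datetime import date

# Hypothetical atomic records: one row per transaction line item.
ATOMIC_SALES = [
    {"customer_id": "C1", "store_id": 123, "product": "diapers",
     "sold_on": date(2000, 3, 6), "amount": 12.50},
    {"customer_id": "C2", "store_id": 123, "product": "diapers",
     "sold_on": date(2000, 3, 8), "amount": 12.50},
    {"customer_id": "C1", "store_id": 123, "product": "milk",
     "sold_on": date(2000, 3, 8), "amount": 3.00},
    {"customer_id": "C3", "store_id": 456, "product": "diapers",
     "sold_on": date(2000, 3, 7), "amount": 12.50},
]


def store_week_summary(rows):
    """Roll atomic rows up to one sales total per (store, ISO year, ISO week)."""
    totals = defaultdict(float)
    for row in rows:
        iso = row["sold_on"].isocalendar()  # (ISO year, ISO week, weekday)
        totals[(row["store_id"], iso[0], iso[1])] += row["amount"]
    return dict(totals)


def customers_who_bought(rows, store_id, product):
    """Answer a detail question that the store-week summary cannot."""
    return sorted({
        row["customer_id"]
        for row in rows
        if row["store_id"] == store_id and row["product"] == product
    })


if __name__ == "__main__":
    print(store_week_summary(ATOMIC_SALES))
    print(customers_who_bought(ATOMIC_SALES, 123, "diapers"))
```

Once the rows are collapsed into store-week totals, the customer and product identities are gone, so the second function has nothing to work from. That is the argument for keeping the atomic grain alongside the summaries.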

Separating Hub from Spoke

Today, there are interesting technologies that place the atomic data requirement square in their sights by optimizing the fact table's storage component across multiple storage management devices. Consider FileTek's StorHouse/RM product, which manages a whole set of tiered storage products, including RAID and robotic libraries for optical storage and tapes, ranging from small tabletop libraries to the largest data center silos. StorHouse provides direct relational access to all these media, while the high density and rapid access of today's drives serve queries against the long, linear detail tables that can occupy 75 percent or more of a CRM-ready data warehouse's volume. StorHouse repositories act like any standard data source in conjunction with common data warehouse tools.

The atomic details enter from an integrated detail source (the data warehouse staging area or the output from an IP mediation job) and collect in the StorHouse repository. From there, any data mart can create the dimensions and facts of its "spoke" directly from the atomic detail.
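As a rough sketch of that hub-and-spoke flow, the Python below derives one mart's caller dimension and dropped-call facts from atomic call records. Everything here is an assumption made for illustration: a plain Python list stands in for the hub repository, the field names and sample values are invented, and nothing in it represents the StorHouse interface.

```python
from collections import Counter

# Hypothetical atomic call-detail records as they might collect in the hub.
ATOMIC_CALLS = [
    {"caller": "215-555-0101", "sector": "I95-07", "hour": 8, "dropped": True},
    {"caller": "215-555-0101", "sector": "I95-07", "hour": 8, "dropped": False},
    {"caller": "410-555-0199", "sector": "I95-07", "hour": 17, "dropped": True},
    {"caller": "410-555-0199", "sector": "I95-12", "hour": 17, "dropped": True},
]


def build_spoke(atomic_rows):
    """Derive a caller dimension and dropped-call fact rows from atomic detail."""
    caller_dim = {}       # natural key (phone number) -> surrogate key
    attempts = Counter()  # calls per (caller_key, sector, hour)
    drops = Counter()     # dropped calls per (caller_key, sector, hour)
    for row in atomic_rows:
        # Assign the next surrogate key the first time a caller is seen.
        key = caller_dim.setdefault(row["caller"], len(caller_dim) + 1)
        grain = (key, row["sector"], row["hour"])
        attempts[grain] += 1
        drops[grain] += int(row["dropped"])
    facts = [
        {"caller_key": k, "sector": s, "hour": h,
         "calls": attempts[(k, s, h)], "dropped": drops[(k, s, h)]}
        for (k, s, h) in sorted(attempts)
    ]
    return caller_dim, facts


if __name__ == "__main__":
    dim, facts = build_spoke(ATOMIC_CALLS)
    print(dim)
    for fact in facts:
        print(fact)
```

A second mart with a different question could run its own derivation over the same atomic rows and choose a different grain, which is the design point of keeping the hub separate from its spokes.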

Technologies such as this may not be for the vanilla warehouse, and they do not replace the need for scalable architecture and processes. Their best use will be with data warehouses that are outgrowing current designs and moving into the multiterabyte range of storage.

Among telecommunications carriers alone, the transition from a traditional, switched transport network to an IP-centric "next generation" network will expand the detail data generated approximately twentyfold, according to IP-mediation leader XACCT Technologies. E-commerce technology will allow collection of more customer behavior detail at a lower collection price than ever before. It won't just be the Fortune 100 in this market, but the entire Global 2000 plus new entrants who don't even exist today.
