In nearly every major industry, high-quality, real-time decision making has become a prerequisite for competing successfully. Data and related applications have been the primary catalysts in making real-time decision making a reality.
Consider the way leading financial services and insurance companies bring together historical and external data to instantly assess the lifetime value of customers, the way leading telecommunications companies use call detail records for evaluating proposed billing changes or the way leading retailers use customer and inventory data to more effectively market, cross-sell and keep their shelves stocked with the items their customers want. The list goes on.
Recently, the stakes have become higher because data storage is now inexpensive enough for companies to consider keeping all of their data forever. The assumption is that at opportune moments, deep historical data can yield business insights and other advantages never before imagined.
However, if companies fail to think carefully about their data storage choices, a loss in computing performance could undermine their ability to realize real-time decision making for both tactical and strategic initiatives. Initiatives may founder, either because decisions were not timely or because executives believed they had a complete picture of the business challenge and the company's capabilities when, in fact, they did not. Even if an initiative staggers along, the need to constantly dedicate manpower to tune a system in which the storage capacity does not match the company's business needs can be costly and frustrating.
Mistaken Concentration on Cost Per Megabyte
With data doubling every eight months (thanks to such things as call detail records, transactions and Web clickstreams), companies with data warehouses are understandably searching for the least expensive storage options by comparing price per megabyte. Vendors tend to encourage this approach by offering drives with huge storage capacities to cope with the increased data requirements.
Yet if I/O speed is fixed, or does not advance as rapidly as storage capacity (which is often the case with today's technology), increasing the storage capacity per disk may make it more difficult for companies to rapidly access the information they so desperately need for real-time decision making. In such cases, the dollars saved up front by choosing a larger storage drive can be lost because a new multimillion-dollar initiative that depended on rapid access to that data could not deliver as promised.
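The effect of capacity outpacing I/O speed can be illustrated with simple arithmetic. The sketch below assumes a single sustained transfer rate for every disk size; the 50 MB/s figure and the function name are hypothetical, chosen only to show the relationship, not measured values for any real drive.

```python
def full_scan_seconds(capacity_gb: float, throughput_mb_per_s: float) -> float:
    """Seconds needed to read an entire disk sequentially at a fixed transfer rate."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    # 1 GB is treated as 1000 MB for this rough estimate.
    return capacity_gb * 1000 / throughput_mb_per_s


# Assumption: the transfer rate stays at a hypothetical 50 MB/s while capacity grows.
for capacity_gb in (36, 73, 146):
    seconds = full_scan_seconds(capacity_gb, throughput_mb_per_s=50.0)
    print(f"{capacity_gb} GB disk: {seconds / 60:.1f} minutes to read in full")
```

Under this assumption, the time to read all the data on a disk grows in direct proportion to its capacity, so the saving per megabyte is paid for in slower access per megabyte.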
Of course, regardless of disk size, there are things companies can and should do to try to address the performance issues. Partitioning, compression, priority scheduling, query optimization and advanced indexing can all help. However, these techniques become even more effective when combined with storage capacity that is matched to a company's strategic goals, its computing power and the nature or "temperature" of its data.
The Multitemperature Data Warehouse
Understanding data warehouse storage begins with the concept of a multitemperature data warehouse. In nearly any warehouse, data has a variety of temperatures. Hot data tends to be more recent, frequently used data accessed by multiple users and multiple applications for multiple queries.
As data cools, the demands decrease. This cooler data tends to be historical and tends to occupy much larger volumes in the warehouse than warm and hot data.
Yet even cool data occasionally flares hot. Consider, for example, how a health insurer may eventually need to produce many years' worth of records that demonstrate how the company has protected patient privacy.
Capacity Planning that Takes Data's Temperature
With an understanding of multitemperature data, companies can conduct a capacity planning process that gauges the aggregate temperature of the data, system capacity and system performance needs and capabilities. This requires that business users work closely with IT people to ensure that the IT staff understands how the various data is likely to be used.
The first step is to classify the data by access frequency and data volume. Companies can start by classifying data as being primarily tactical, current decision support or historical decision support and, ultimately, determining the temperature of each class. (A formula for measuring data temperature might include the performance demand for queries, updates and data maintenance.)
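One way such a classification might be sketched is shown below. The article does not give an actual formula, so everything here is an assumption made for illustration: the score (daily query, update and maintenance demand per gigabyte stored), the thresholds, and the sample figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataClass:
    name: str
    volume_gb: float
    queries_per_day: float
    updates_per_day: float
    maintenance_ops_per_day: float


def temperature(dc: DataClass) -> float:
    """Hypothetical score: total daily performance demand per GB stored."""
    demand = dc.queries_per_day + dc.updates_per_day + dc.maintenance_ops_per_day
    return demand / dc.volume_gb


def label(score: float, hot_at: float = 10.0, warm_at: float = 1.0) -> str:
    """Map a score to a temperature label; the thresholds are illustrative only."""
    if score >= hot_at:
        return "hot"
    if score >= warm_at:
        return "warm"
    return "cool"


# Invented sample figures for the three classes named in the text.
classes = [
    DataClass("tactical", 100, 5000, 2000, 50),
    DataClass("current decision support", 500, 1500, 200, 20),
    DataClass("historical decision support", 5000, 100, 0, 10),
]

for dc in classes:
    score = temperature(dc)
    print(f"{dc.name}: score {score:.3f} -> {label(score)}")
```

In practice, business users and IT staff would need to agree on the weights and thresholds, since they encode how the data is expected to be used.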
Next, the company determines the capacity and performance requirements for each type of data, which it then uses to create system requirements. In the case of a smaller warehouse dominated by hot and warm data, the goal is a price/performance tradeoff that enables the system to consistently deliver real-time intelligence for important decision making. In most cases, the company would use the smallest available disks (such as 36GB, 15K RPM disks) and RAID-1 redundancy.
In contrast, warehouses with a high proportion of cool data need less high-powered performance from their storage, so companies can opt for more capacity per unit (73GB or 146GB disks). In either case, or in a warehouse where the data temperature is more balanced, determining the temperature of each type of data and finding an aggregate is critical to choosing the storage disk size.
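The aggregate step can be sketched as a volume-weighted share of hot and warm data. This is only an illustration: the 50 percent cutoff, the function names and the sample volumes are assumptions, and a real capacity plan would also weigh performance requirements and cost.

```python
from typing import Mapping


def hot_warm_share(volumes_gb: Mapping[str, float]) -> float:
    """Fraction of total warehouse volume that is classified hot or warm."""
    total = sum(volumes_gb.values())
    if total <= 0:
        raise ValueError("total volume must be positive")
    return (volumes_gb.get("hot", 0.0) + volumes_gb.get("warm", 0.0)) / total


def suggest_disk(volumes_gb: Mapping[str, float], threshold: float = 0.5) -> str:
    """Suggest a disk class from the aggregate temperature (hypothetical cutoff)."""
    if hot_warm_share(volumes_gb) >= threshold:
        return "36GB 15K RPM disks with RAID-1"
    return "73GB or 146GB disks"


# Invented volumes, in GB, for two warehouses.
small_hot_warehouse = {"hot": 300, "warm": 200, "cool": 100}
large_cool_warehouse = {"hot": 50, "warm": 150, "cool": 1800}

print(suggest_disk(small_hot_warehouse))
print(suggest_disk(large_cool_warehouse))
```

A warehouse near the cutoff is the balanced case described above, where the per-class temperatures deserve a closer look before a disk size is chosen.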
Be a Data Weatherperson
The bottom line is straightforward: The value of data in a warehouse lies in its promise of exemplary decision support, particularly for the real-time decision making that is so central to today's business environment. A partnership between business and technical people and a strong understanding of the concept of a multitemperature warehouse are essential if the data warehouse is to deliver on its promise.