For companies and government entities that create, store and manage huge volumes of transactional data, ensuring timely access to that data is a daunting and costly challenge. Furthermore, data retention, archiving and compliance requirements are at the top of the CIO’s list of priorities. The current approach is to use a relational database or data warehouse for the retention of historical transaction data. But as the data volumes increase, so do the associated software, hardware and personnel costs for storing that data in a way that guarantees it will be easy to retrieve when it is needed.

Security and compliance regulations demand perfect recall of events spanning many years from corporations and government agencies that generate vast volumes of transactions on a daily basis. The corporate memory consists of unstructured data such as text, semistructured data such as email, and structured event and reference data typically stored within relational databases. All three categories of data continue to grow at enormous rates, and most of that data remains unchanged once created.

Indeed, governance and accurate memory require that events be recalled as they were at the point in time they occurred. The burgeoning corporate memory is typically stored in one or more sizable database management systems that consume a disproportionate share of the IT budget, including hardware, software and operational costs.

An industry report I worked on recently noted that 85 percent of production data stored in database management systems is inactive and suggested that information and knowledge management professionals should devise a strategy that moves such inactive data to a lower-cost alternative. I argue that you do not need to incur the cost and complexity associated with the transaction integrity of a relational database system when the data is inactive. I suggest that several immediate benefits can be achieved by moving the data out of the database onto lower cost, highly scalable storage – all without sacrificing access. Indexes can be maintained against transactions stored on an unstructured but highly scalable file system in order to recall precisely what you need within a few seconds from tens or hundreds of billions of rows of data.
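To make the idea of indexing transactions that live on an ordinary file system more concrete, here is a minimal sketch in Python. It is not any particular product's method: the file layout (one comma-separated transaction per line, customer ID in the first field) and the function names `build_offset_index` and `fetch_records` are assumptions made purely for illustration. The index records the byte offset of each line so that a lookup can seek straight to the matching records instead of scanning the whole file.

```python
import os
import tempfile
from collections import defaultdict


def build_offset_index(path, key_field=0, delimiter=","):
    """Scan the flat file once and record the byte offset of every line, per key.

    Assumes one transaction per line; key_field is the (hypothetical) position
    of the lookup key, for example a customer ID.
    """
    index = defaultdict(list)
    sep = delimiter.encode()
    with open(path, "rb") as f:
        while True:
            offset = f.tell()          # position of the line about to be read
            line = f.readline()
            if not line:               # end of file
                break
            key = line.rstrip(b"\r\n").split(sep)[key_field].decode()
            index[key].append(offset)
    return dict(index)


def fetch_records(path, index, key):
    """Return every record for key by seeking directly to its stored offsets."""
    records = []
    with open(path, "rb") as f:
        for offset in index.get(key, []):
            f.seek(offset)
            records.append(f.readline().decode().rstrip("\r\n"))
    return records


def demo():
    """Write three made-up transactions to a temp file and look one customer up."""
    rows = [
        "C001,2009-01-05,19.99",
        "C002,2009-01-05,5.00",
        "C001,2009-02-11,42.50",
    ]
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "w", newline="") as f:
        f.write("\n".join(rows) + "\n")
    try:
        index = build_offset_index(path)
        return fetch_records(path, index, "C001")
    finally:
        os.remove(path)


if __name__ == "__main__":
    for record in demo():
        print(record)
```

In this sketch the index is a plain in-memory dictionary. At the scale of billions of rows the index itself would need to be stored on disk and partitioned, but the principle is the same: the data stays in inexpensive flat files, and only the small index has to be consulted to find a record.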

What I’m suggesting is a virtual data storage and retrieval option or, put another way, a technology where data files can be located anywhere on the network. The technology integrates seamlessly with existing applications and databases, and no proprietary hardware is required. Excellent performance is achieved with low-cost secure data retrieval servers. These servers deliver instant access to an unlimited history of transaction data, whether seconds or decades old, for fast selective query access and retrieval. Specifically, these types of solutions raise the bar on what companies should be able to expect from search and archive offerings because they bring an unprecedented and otherwise cost-prohibitive level of speed and simplicity to the process of retrieving a specific record, or groups of records, from a file system.

Moreover, such an indexed store can offer an additional level of business intelligence (BI). Most companies store summary-level data in their data warehouse for use in analytics. For instance, a retailer may store the total amount spent by an individual at a store over a period of time, or even the amount that the individual spent at each visit over several months or years. But typically, the detail of the specific items purchased at each visit by an individual is lost. Most companies are forced to discard that level of detail because the volumes involved make it too costly to justify keeping every transaction within the data warehouse.

That said, there is technology available that complements data warehouses and the BI approach. For instance, a marketing analysis that identifies useful information in the data warehouse can then drill down into the more granular level of detail now stored in the file system. Every detail is immediately accessible in the file system, providing the specific level of intelligence needed to create a unique and targeted promotion intended to drive an increase in revenue.
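The drill-down described here can be sketched in a few lines of Python. Everything in the example is hypothetical: the customer IDs, items and prices are invented, `DETAIL` stands in for the item-level records kept on the file system, and `SUMMARY` stands in for the totals a data warehouse would hold. The point is only to show the two-step pattern: pick customers from the summary, then pull their item-level history from the detail store.

```python
from collections import Counter

# Hypothetical item-level detail, as it might be kept in the file-based store:
# customer ID -> list of (date, item, price).
DETAIL = {
    "C001": [
        ("2009-01-05", "espresso machine", 249.00),
        ("2009-02-11", "coffee beans", 12.50),
        ("2009-03-02", "coffee beans", 12.50),
    ],
    "C002": [
        ("2009-01-05", "tea bags", 5.00),
    ],
}

# Hypothetical warehouse-style summary: total spend per customer only.
SUMMARY = {
    customer: sum(price for _, _, price in purchases)
    for customer, purchases in DETAIL.items()
}


def top_customers(summary, min_total):
    """Step 1: use the summary data to pick customers worth a closer look."""
    return sorted(c for c, total in summary.items() if total >= min_total)


def item_counts(detail, customers):
    """Step 2: drill into item-level detail for just those customers."""
    counts = Counter()
    for customer in customers:
        for _, item, _ in detail.get(customer, []):
            counts[item] += 1
    return counts


if __name__ == "__main__":
    selected = top_customers(SUMMARY, 100.0)
    print(selected)
    print(item_counts(DETAIL, selected).most_common(1))
```

The summary alone says only that one customer spends a lot. The detail shows what that customer buys repeatedly, which is the information a targeted promotion needs.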

This technology offering brings the best of both worlds: the precise and immediate access you would get from storing the data in a high-cost, complex relational database, along with the scalability and low-cost profile you get from storing the data in a file system.

More than ever, IT organizations are chartered to align data practices with business goals, specifically through more intelligent applications of technology to improve customer service and satisfaction, develop new products and services, build brand value and drive competitive advantage. With IT budgets being reduced each year, it is imperative to find new and better ways to support the IT infrastructure so that more resources can be made available for innovation to support new business imperatives. One straightforward approach is to reduce the amount of capital consumed by relational databases and their associated hardware by offloading much of the inactive data to a lower-cost, highly scalable storage environment, without sacrificing access to the data.
