JUN 1, 2005 1:00am ET

The Data Warehouse Appliance Myth

There has been a great deal of interest recently in the concept of the data warehouse appliance and the advantages it brings to organizations trying to juggle the demands for increased performance and reduced TCO against the stark reality of ever-increasing data volumes.

The data warehouse appliance concept is reputed to bring a number of benefits to the customer that cannot be achieved using conventional warehouse technologies. Some of these are genuine benefits that a customer will undoubtedly see (i.e., low TCO, high performance, increased scalability and ease of installation).

However, other reputed virtues of the appliance, as promoted by the emerging appliance vendors, include the benefit of software that is tightly optimized to the specifically tailored hardware as well as the simplicity of a single source for hardware and software.

While promoting these virtues, what these vendors are actually saying is that the genuine benefits can only be accomplished using specialist hardware platforms and that this is the only way to build an appliance. Therein lies the myth.

What these vendors actively avoid talking about is the dreaded "P" word - "proprietary" - when mentioning hardware. But that is exactly what these solutions are. Much is made of "commodity components" and this is largely true, but using these components to build platforms tailored to suit one application significantly limits their openness. Platform support and maintenance are only possible from the vendor; the opportunity to reuse or deploy the platform for other purposes, thus protecting investment and allowing for a more dynamic business, is significantly limited. In this day and age, single-use proprietary hardware is not desirable.

The myth seems to be spreading (without mention of the "P" word) that this is the only way to assemble a low-cost, high-performance data warehouse solution. But there's another way!

Some companies are now providing software that is being used to build data warehouse appliances on arrays of low-cost, commodity, off-the-shelf servers. The resulting solutions produce equal, if not better, performance and scalability and can still be supplied and configured as an appliance.

While these companies may have very different approaches, their solutions are built on platforms that can be configured to do other jobs if required. The platform is not tied to one function - it is in effect a "virtual appliance." Employing the often-used kitchen appliance analogy, the virtual appliance allows the dishwasher to become a washing machine or a microwave simply by changing the software. Probably more importantly, it allows the dishwasher to be upgraded by stealing (or borrowing) some hardware from the under-utilized toaster. Fixed "lifetime" devices may be OK in the kitchen but not in the data center.

Let's be clear about the differences. The distinction between "commodity components" and "commodity servers" is extremely important. Commodity components (e.g., processors and memory) are the basic building blocks of virtually all hardware, proprietary or not, while commodity servers are general-purpose servers supplied by vendors such as HP, IBM and Dell.

The vendors of proprietary hardware in the data appliance space claim that the use of "commodity components" will allow their platforms to ride the technology curve. However, designing these components into proprietary hardware is not enough to stop platforms from quickly becoming outdated and expensive. Technologies simply move too quickly, and the development costs are unsustainably high. The safest way of ensuring that a smooth platform upgrade path is always available is to adopt a commodity platform from the major hardware vendors.

The so-called benefit of closely coupled hardware and software also merits careful consideration. It was true, a few years ago, that to get the best out of any hardware the software needed to be very tightly coupled to the platform. Getting the highest levels of optimization required custom device drivers and operating system kernel software that employed lots of clever performance techniques that were just not available in standard operating systems. But then along came Linux. Today these techniques, such as "zero copy drivers," can be found in standard Linux implementations and it is certainly possible, now, to process data as fast as a disk drive can deliver it using Linux and commodity servers.

Finally, the proposition of a single source for hardware and software is promoted as a virtue of the data appliance solution. One purchase order rather than two and one source of support appear to be genuine benefits, but these are far outweighed by the disadvantages associated with proprietary hardware. Even the benefit of a single source of support is very tenuous. Vendors of software-only appliances will typically provide first-line support for their software that includes the diagnosis of hardware faults. The customer is informed of exactly what hardware requires attention so that it can be easily and quickly dealt with by the hardware vendor's global support network.

The data warehouse appliance concept has many benefits for the customer, but it is important to remember that although some appliances require proprietary hardware, there is an alternative. Low-TCO, linearly scalable, high-performance warehouses are now being built using low-cost, industry-standard, commodity servers.

Roger Gaskell is director of product development for Kognitio. Kognitio was formed in August 2005 with the merger of Kognitio and WhiteCross. Gaskell joined WhiteCross in 1988 and has overall responsibility for product development. He has been responsible for the development of WhiteCross's data appliance technology, evolving WhiteCross from a proprietary hardware appliance to a software-only virtual appliance built on industry-standard servers. He may be reached at roger.gaskell@kognitio.com.
