There has been a great deal of interest recently in the concept of the data warehouse appliance and the advantages it brings to organizations trying to juggle the demands for increased performance and reduced TCO against the stark reality of ever-increasing data volumes.

The data warehouse appliance concept is reputed to bring a number of benefits to the customer that cannot be achieved using conventional warehouse technologies. Some of these are genuine benefits that a customer will undoubtedly see (i.e., low TCO, high performance, increased scalability and ease of installation).

However, other reputed virtues of the appliance, as promoted by the emerging appliance vendors, include the benefit of software that is tightly optimized to the specifically tailored hardware as well as the simplicity of a single source for hardware and software.

While promoting these virtues, what these vendors are actually saying is that the genuine benefits can only be accomplished using specialist hardware platforms and that this is the only way to build an appliance. Therein lies the myth.

What these vendors actively avoid talking about is the dreaded "P" word - "proprietary" - when mentioning hardware. But that is exactly what these solutions are. Much is made of "commodity components," and the claim is largely true, but using these components to build platforms tailored to suit one application significantly limits openness. Platform support and maintenance are only possible from the vendor; the opportunity to reuse or redeploy the platform for other purposes, thus protecting investment and allowing for a more dynamic business, is significantly limited. In this day and age, single-use proprietary hardware is not desirable.

The myth seems to be spreading (without mention of the "P" word) that this is the only way to assemble a low-cost, high-performance data warehouse solution. But there's another way!

Some companies are now providing software that is being used to build data warehouse appliances on arrays of low-cost, commodity, off-the-shelf servers. The resulting solutions produce equal, if not better, performance and scalability and can still be supplied and configured as an appliance.

While these companies may take very different approaches, their solutions are built on platforms that can be configured to do other jobs if required. The platform is not tied to one function - it is in effect a "virtual appliance." Employing the often-used kitchen appliance analogy, the virtual appliance allows the dishwasher to become a washing machine or a microwave simply by changing the software. Probably more importantly, it allows the dishwasher to be upgraded by stealing (or borrowing) some hardware from the under-utilized toaster. Fixed "lifetime" devices may be OK in the kitchen but not in the data center.

Let's be clear about the differences. The distinction between "commodity components" and "commodity servers" is extremely important. Commodity components (e.g., processors and memory) are the basic building blocks of virtually all hardware, proprietary or not, while commodity servers are general-purpose servers supplied by vendors such as HP, IBM and Dell.

The vendors of proprietary hardware in the data appliance space claim that the use of "commodity components" will allow their platforms to ride the technology curve. But designing these components into proprietary hardware is not enough to stop platforms from quickly becoming outdated and expensive. Technologies simply move too quickly, and the development costs are unsustainably high. The safest way of ensuring that a smooth platform upgrade path is always available is to adopt a commodity platform from the major hardware vendors.

The so-called benefit of closely coupled hardware and software also merits careful consideration. It was true, a few years ago, that to get the best out of any hardware the software needed to be very tightly coupled to the platform. Getting the highest levels of optimization required custom device drivers and operating system kernel software that employed lots of clever performance techniques that were just not available in standard operating systems. But then along came Linux. Today these techniques, such as "zero copy drivers," can be found in standard Linux implementations and it is certainly possible, now, to process data as fast as a disk drive can deliver it using Linux and commodity servers.
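As a rough illustration of the kind of zero-copy facility now exposed by a standard Linux system, the sketch below copies a file with the sendfile system call through Python's os.sendfile, so the data moves inside the kernel rather than passing through a user-space buffer. This is a related, system-call-level technique and not the "zero copy drivers" themselves; the function name copy_with_sendfile and the 1 MB chunk size are assumptions made for the example, and file-to-file sendfile is assumed to be available (Linux 2.6.33 or later).

```python
import os


def copy_with_sendfile(src_path: str, dst_path: str, chunk: int = 1 << 20) -> int:
    """Copy src_path to dst_path with os.sendfile and return the bytes copied.

    Hypothetical helper for illustration only; assumes a Linux kernel that
    supports sendfile between two regular files.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        size = os.fstat(src.fileno()).st_size
        while copied < size:
            # The kernel moves up to `chunk` bytes from src (at offset
            # `copied`) to dst without staging them in a Python buffer.
            sent = os.sendfile(dst.fileno(), src.fileno(), copied,
                               min(chunk, size - copied))
            if sent == 0:  # unexpected end of input
                break
            copied += sent
    return copied
```

Nothing here depends on vendor-specific hardware or a custom kernel, which is the point the paragraph above makes about commodity servers running Linux.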

Finally, the proposition of a single source for hardware and software is promoted as a virtue of the data appliance solution. One purchase order rather than two and one source of support appear to be genuine benefits, but these are far outweighed by the disadvantages associated with proprietary hardware. Even the benefit of a single source of support is very tenuous. Vendors of software-only appliances will typically provide first-line support for their software that includes the diagnosis of hardware faults. The customer is informed of exactly what hardware requires attention so that it can be easily and quickly dealt with by the hardware vendor's global support network.

The data warehouse appliance concept offers the customer many benefits, but it is important to remember that although some appliances require proprietary hardware, there is an alternative. Low-TCO, linearly scalable, high-performance warehouses are now being built using low-cost, industry-standard, commodity servers.
