Judging from recent events and articles, data warehouse appliances have come into their own. Netezza had a very successful IPO in July and new companies are emerging. Vendors like HP and IBM have come out with new offerings.
Why all this interest and activity now? Quite possibly it's because the mainstream business community has reached the data threshold and is looking hard for data management solutions in the multi-terabyte range. More and more, appliances are on the IT expenditure short list.
Just a few years ago, multi-terabyte data warehouses were few and far between. Today, organizations are voluntarily collecting more data than ever because they know they can use it to guide their business. Retail and e-commerce companies, for example, collect clickstream data, customer information and transaction details. Others have no choice: any organization affected by data retention regulations or reporting and auditing mandates is required to accumulate larger amounts of data. Almost everywhere you look, organizations have to, want to, or plan to capture, retain and eventually use vast amounts of data.
The data warehouse appliance was introduced specifically to address the needs of the "big data" vanguards. At the time of its introduction, the wherewithal to amass and analyze large databases was in the hands of a few technologically sophisticated companies that sought to simplify their data warehousing infrastructure. The appliance approach relieved IT of having to build its own infrastructure out of a mix of iron, wiring and hand-coded software modules. If the infrastructure could be streamlined, more time could be spent on the data and information side of the equation, that is, the parts that brought tangible value to the business.
The data warehouse appliance achieved that by packaging together everything needed to build a data warehouse. Its goal was to deliver a "data warehouse in a box." The early ones were very powerful and expensive boxes that called for significant investment and expertise to implement successfully. Figure 1 below shows how the appliance simplifies the data warehouse infrastructure by reducing the number of components, vendors and connections that need to be managed by IT staff.

How Do You Know an Appliance When You See One?
These early appliances, or data warehouse boxes, fulfilled the strict definition of an appliance: they were built for a specific purpose. Implied in this definition, though, and certainly an important part of how we understand the term "appliance," is the sense that the appliance makes the task easier to perform. For example, we expect an appliance designed to grill slices of bread to make toasting easier. And if the bread browns more evenly, that's all the better.
An even better analogy to the data warehouse appliance is TiVo. The TiVo device is a package of very sophisticated hardware, connectivity and software. But it comes as a box with straightforward cables and is pretty close to "plug and play." Consumers don't know that there's a hard drive or Linux operating system in there. And frankly, they shouldn't have to. They bought TiVo so they wouldn't have to miss their favorite shows. TiVo's approach means that they get to spend their time personalizing the offering to match their objectives instead of configuring and tweaking low-level parameters.
Data warehouse appliances are far simpler to install and maintain than a typical database and storage infrastructure, and they're easier to get up and running than a custom-built one. But is that enough? Where do they fall on the TiVo spectrum? How hard is it to personalize them?
Personalizing Infrastructure
Why talk about personalization in the context of data warehousing? Why add personalization to the list of desired characteristics for a data warehouse appliance? I believe personalization is the element that brings the data warehouse appliance to the next level of usefulness and relevance to the business drivers behind current data warehouse funding. Today's data warehouse appliances are very good at performing the operations they were designed and implemented for. There's one problem: things change.
A business buys a data warehouse appliance. That appliance's design is based on assumptions about how a data warehouse looks and is used, so the business chooses the appliance that best matches its expectations of a data warehouse. The business then spends a lot of time and resources implementing (configuring, tuning) the data warehouse, often reshaping how entire departments touch data. Then things change: a new regulation, a new payment model, a merger or acquisition. Data warehouse infrastructure can change, too, but only with enormous overhead and disruption.
Today, changing infrastructure to accommodate business shifts is cumbersome, and personalization is nearly impossible. By personalization I mean continually adjusting aspects of the appliance to suit an organization's needs. I believe that the market will soon demand the ability to personalize infrastructure as customers start expecting their data warehouses to contribute to their agility instead of defining its limits. Greater flexibility will translate into more demands for scalability: accommodating complex analytics, routine reporting and everything in between, and handling more users who want to do a greater variety of things with more data. And businesses will want to do this in ways that are unique to their enterprises, because therein lies their competitive advantage.
What's your Infrastructure's TiVo Rating?
Understanding how close your data warehouse infrastructure is to a TiVo system will allow you to anticipate the requirements your business users will soon bring. The closer you are to matching TiVo, the more effectively you'll be able to respond.

To profile your infrastructure's flexibility, consider all the components that have to be touched when you make significant additions to users, data volume, or dependent applications:
- How many physical parts are involved? Include hardware, cables, power packs, etc.









