Columns in the Clouds
Vertica CEO Ralph Breslauer takes analytic databases into the new world of hosted data infrastructure
Information Management Magazine, August 1, 2008
Vertica Systems CEO Ralph Breslauer was already in the software business when he left South Africa 20 years ago to pursue a career in the U.S. In stints including major roles at Informix, Ardent Software and eRoom (now a part of EMC), Breslauer always tended to evangelize the practical side of technology. He explains, "Despite my computer science degree, I've always seen myself as that person between technology and getting great value out of it." Now, as CEO of a young and growing upstart in the hot columnar database arena, Breslauer says Vertica's modern architecture is ready to fuel grid-enabled computing and launch a major shift in enterprise data warehousing and business intelligence. Breslauer recently sat down with DM Review editorial director Jim Ericson to fill in the blanks.
DMR: Vertica is relatively new in the analytic database industry. What was the opportunity you sensed in the market?
RB: I tend to look at the analytic database as a three-generational thing. You have also spoken about generations in the past, but I'm not sure we see them exactly the same way. To me, the Oracles of the world and IBMs and SQL Servers are the great first-generation products we all use and will continue to use. The second generation is really the Teradatas, the Netezzas and DATAllegros, which established the need for a separate analytic solution. You don't try to do it with the database; you want something different to get the performance you need for analytics. Those guys did a good job of establishing analytics as a need, they got better at doing sequential scans and faster stuff with disk, but it was still the same row-oriented architecture. It still had inherent limitations in areas like concurrency. Then the third generation of players, where we see ourselves as a leader, is really architected from the ground up as a shared-nothing grid architecture, massively parallel and using column architecture rather than row architecture.
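
As a rough illustration of the row-versus-column distinction Breslauer draws, the Python sketch below lays out the same small table both ways and counts how many bytes an analytic query (summing a single column) has to read in each case. It is a toy model, not Vertica's implementation: the table, its four integer columns, and the function names are all hypothetical and chosen only for the example.

```python
import struct

# Hypothetical table: 1,000 rows of four 32-bit integers
# (id, price, quantity, region_code). Not taken from the article.
ROW_FORMAT = "<4i"
ROW_SIZE = struct.calcsize(ROW_FORMAT)  # 16 bytes per row
PRICE = 1                               # index of the column being summed

rows = [(i, 100 + i % 7, i % 5, i % 3) for i in range(1000)]

# Row-oriented layout: complete rows stored back to back.
row_store = b"".join(struct.pack(ROW_FORMAT, *r) for r in rows)

# Column-oriented layout: each column stored contiguously on its own.
column_store = [
    b"".join(struct.pack("<i", r[c]) for r in rows) for c in range(4)
]


def sum_price_row_store(buf):
    """Sum one column by scanning every full row. Returns (total, bytes_read)."""
    total = 0
    bytes_read = 0
    for offset in range(0, len(buf), ROW_SIZE):
        record = struct.unpack_from(ROW_FORMAT, buf, offset)
        bytes_read += ROW_SIZE          # the whole row is read to get one field
        total += record[PRICE]
    return total, bytes_read


def sum_price_column_store(columns):
    """Sum one column by reading only that column. Returns (total, bytes_read)."""
    buf = columns[PRICE]
    values = struct.unpack(f"<{len(buf) // 4}i", buf)
    return sum(values), len(buf)


row_total, row_bytes = sum_price_row_store(row_store)
col_total, col_bytes = sum_price_column_store(column_store)
print(f"row layout:    total={row_total}, bytes read={row_bytes}")
print(f"column layout: total={col_total}, bytes read={col_bytes}")
```

Both layouts return the same total, but in this toy table the column layout reads a quarter of the bytes because the query touches one column out of four. Real systems add compression, indexing and parallelism, none of which is modeled here.
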
DMR: We know there's a resurgence in columnar databases, yet products such as Sybase IQ have been around for a long time.
RB: Sybase IQ is a great column store, but it's now something like 15 years old. When people designed that software they looked at the available machines, and machines have always been input/output bound in analytics and in data warehousing, meaning that the amount of data you bring back from the disk is really your bottleneck. What's happened since is that central processing units (CPUs) have gotten so much faster while disks have gotten cheaper but not a whole lot faster. Today, you can improve the performance exponentially by having the data where the CPUs are. So instead of running on a bunch of computers with a big stand, we run on a shared-nothing grid. You pop another node in the cluster and you get immediate and substantial additional performance. We're just that much more modern.
DMR: So what's the deployment route for your customers?
RB: There are three ways people use our software. The first way is, you buy it and put it on your enterprise-standard hardware. Sun, HP or Dell, it doesn't matter. The second option is, hang on, you don't have IT resources, you're a smaller shop that wants to plug and play. We offer a prepackaged appliance to do that, all the benefits of DATAllegro or Netezza but without being proprietary. The third and probably the most exciting deployment is offered on the Amazon Elastic Compute Cloud (EC2). The beauty of that is you can get up and running in half an hour as opposed to procuring hardware or waiting for IT to set things up for you. You can do proof of concepts quickly and easily, you can use it for projects that are limited in time and in the end you're not stuck with all the hardware you had to buy. It's also good for independent software vendors (ISVs) because they can add to the stack on the cloud and have an offering very quickly.
DMR: We've never seen anything jump from hype to near-term reality as fast as cloud computing, but because it's happening so quickly, there is still a lack of understanding.
RB: I absolutely agree with you. The cloud is in the maximum of the hype cycle and still extremely nascent in terms of actual usage. A lot of people might be dabbling, and there might even be 100,000 customers. But in terms of revenue being generated there, it's still early.
DMR: You mentioned the Amazon service, but there are different definitions of the cloud. Some people say it's anything hosted off the Web in terms of services, software or hardware; others say it is about multitenant infrastructure. How do you define the cloud?
RB: I think what's important is that it is a publicly accessible place where you can choose to add tons and tons of additional low-cost hardware to get as much scalability as you require. I think there are only a handful of viable cloud offerings today, but there will be many more to come. Amazon EC2 is a computing cloud. Amazon Simple Storage Service (S3) is a storage cloud. Google's got it, Yahoo's got it, probably Microsoft and IBM as well.
DMR: Do you see application service providers and software as a service (SaaS) as components of the cloud?
RB: Look at the cloud as even lower-level infrastructure than the things we've been talking about for the last five years. It's different from SaaS. You can do SaaS on the cloud, but SaaS is much more about just having Internet access and pay as you go. Companies can host that themselves if they want to. What they lack is the ability to access 100,000 servers, where any one customer can use 10 or 50 or 1,000 in any given week and then scale back and scale up again. That's what I see the cloud as being - the ability to have this massive amount of computing power you add and subtract as you like.
DMR: What is Vertica's opportunity over the next two to five years in that regard?
RB: We're the first to market and I think we've got a huge lead, but we're not making millions of dollars on it. However, I believe it's going to do a few things for us. First, it gives a lot of people the opportunity to quickly and easily try our technology, and if they like it they can keep using it on the cloud or they can decide to buy it internally. So the whole proof of concept, which all of us in the analytics space have to address for every deal we get, is very well suited to being done on the cloud. Second, it allows people to do big but temporary projects they couldn't do before. Imagine you have 10 terabytes of data you want to analyze, but it's really a three-month competitive pricing analysis. You're not likely to go spend a million bucks to buy the hardware for that. You could probably do the same project for $30K or $40K on the cloud. You don't buy things outright and you can go month to month; we're not asking customers for multiyear commitments. By the way, telecom and financial services presently make up more than half our business. A lot of those companies have [quantitative analysts], a lot of bright people who want to do analytics but don't always have large IT resources. They could be big, long-term customers of the cloud, because it is appealing to them to let someone else handle all the infrastructure and leave them to their work of querying the data. All in all, it's a pretty exciting time.
Vertica Systems
Jim Ericson is editorial director of Information Management (formerly DM Review), a SourceMedia publication. You can reach him at Jim.Ericson@sourcemedia.com.