How do you feel about the Hortonworks spinoff after being so heavily invested at Yahoo all these years?
The history is that we started working on Hadoop at Yahoo five or six years ago, when it was just a prototype running on 20 nodes. We built out the team and we've been focused on driving it forward for the last six years. Yahoo has built every release of Hadoop and has been the majority contributor to each of them, so as a team we're used to supporting a wider community anyway. The difference is, of course, that we're now going to be supporting Yahoo explicitly as a customer. The key takeaways are that Hortonworks is an independent company and Yahoo is an investor, a customer and a development partner. Yahoo is maintaining a deep bench of folks who have contributed to Hadoop and who have built applications on top of it. We have more than 1,000 active users of Hadoop at Yahoo.
How arm’s length is Yahoo now as a customer?
We are providing them with Tier 3 support, so Yahoo will take the ball for developer training, simple questions and even bugs that can be resolved by developers who are relatively new to Hadoop. We'll be backstopping them for escalations, and if they discover interesting problems that they can't fix, we'll fix them.
So they are across the fence and on their own projects but will offset their costs of developing Hadoop through your work and revenues?
Yes, of course. One of the key reasons we chose to develop our big data platform in open source was the belief that over time an ecosystem would grow from that work, and that would allow Yahoo to benefit from a wider community's investment in that platform. So this is a home run from Yahoo's perspective. They have gotten it to the point where the press is interested in Hadoop and it has wide-scale adoption in thousands of companies, or in departments in thousands of companies. As a result, there is an opportunity for an independent company to take on the key role of driving the technology forward and implementing new features and technologies around Hadoop.
You don’t have plans for an enterprise edition or “freemium” software, so what’s the business model?
First, we are committed to Apache and to open source, and we think there should be a complete version of Hadoop that is downloadable from Apache. Our short-term business model is training and support, plus work with strategic partners such as Yahoo, who have enough interest in seeing the technology continue to evolve in certain directions that they are willing to pay a premium to have us design and develop with them.
Is that enough of a model from a venture capital view?
Well, our two investors are Yahoo and Benchmark Capital. Rob Bearden joined us from Benchmark, where he was a venture partner, to be COO and president, so he certainly believes this is the next big opportunity in enterprise software. We're serious when we say we believe half the world's data is going to be in Hadoop inside five years. That's the scale of opportunity we think it represents. It's going to be a huge ecosystem and we think our rent on that can be significant. Training and support will grow into a significant and healthy business, and that is what we will focus on for the short term, because it's absolutely critical that we coalesce the ecosystem around the open source product and don't experience a splintering like we saw with Unix.
And the model could change down the road?
Sure, but the thing that won't change down the road is our belief that Hadoop and its companion projects should be a complete horizontal layer that is deployable and solves business problems. Our focus in the short term is just growing the market by making it much easier for enterprises to install and use Hadoop, and much easier for third parties to build software businesses, OEM businesses or integration businesses around Hadoop. We think, because of our deep technical expertise, we can help bridge that gap, and there's a big opportunity to do that while keeping the core free. We are committed to doing that, which doesn't mean we won't potentially build products on top of Hadoop ourselves at a later date or do other things to drive monetization. But the opportunity is large, we're well funded and we're in a good spot to validate Hadoop, which is our mission.
There are funded Hadoop startups and businesses like Datameer, Cloudera and MapR out there. Some use Apache, some don't. What is the Hortonworks effect on all that?
It's hard to say, obviously, but we believe the great thing about open source is that it lets you partner widely. We're committed to partnering with any of those companies that commit to using Apache Hadoop and putting their improvements into Apache Hadoop. Our job is to make the pie grow bigger.
If Apache is the biggest Hadoop distribution, is it important that it be the successful one and should there be room for multiple distributions and variants?
In any healthy ecosystem there are variants, so we just want to make sure everyone knows they can go to Apache for a great version of Hadoop. Right now there is still some confusion, and it takes a real expert to install and use Hadoop today, so you want to make it easier.