For those who follow the business intelligence (BI) media, the phrases “commercial open source” and “super-crunching” now need little introduction. The successes of Red Hat, JBoss, MySQL and other products have helped educate the technology marketplace on commercial open source. And the current growth trajectories of nascent companies Pentaho and JasperSoft suggest that open source business models translate quite well to BI.

 

At the same time, the 2007 bestselling BI books Super Crunchers and Competing on Analytics clearly point to the evolution of a next level of sophistication in BI, one that includes randomized business experiments, advanced statistical models, machine learning algorithms, and visualization and optimization techniques that top analytics competitors will use to differentiate themselves in the market [1, 2].

 

Both super-crunching and open source can find a champion in The R Project for Statistical Computing. Certainly among the leaders of open source projects, R is now the standard for academic statistical computing, the preferred platform of statistics, graphics and computation for a worldwide community of enthusiastic contributing statisticians – including myself.

 

Several months ago, I read a press release announcing REvolution Computing’s plans for commercial support and enhancement of R, subsidized in part by venture funding through global investment company Intel Capital. At that time, I thought it might be interesting to get the insight of a venture capital firm on both open source and advanced analytics. After all, we hear quite a bit about the opinions of pundits and software vendors, but what about the perspective of a company that’s providing financial backing for a new business model? I then contacted Patrick Walsh, the Intel Capital investment manager responsible for the REvolution account, who was most cooperative and graciously agreed to be interviewed for DMReview. What follows are the thoughts of Walsh on open source and super-crunching. I hope DMReview readers find them as informative as I did.

 

Steve Miller: Intel Capital defines its mission as a “global commitment aimed at investing and supporting profitable enterprises that will drive internet growth, enable new usage models, and advance industry standards.” Can you elaborate on this?

Patrick Walsh: Intel Capital has actively invested in technology startups since the early 1990s, and has now grown to become one of the largest technology venture capital firms in the world. We ended 2007 with about $2.5 billion invested in over 400 companies spanning numerous technology segments. In our investments, we strive for both strategic alignment with Intel’s core businesses and financial return to the portfolio.

 

SM: Intel Capital has a unique focus on open source software. Can you give us a history of your involvement with open source and explain the business models through which commercial open source companies will make money?

PW: As an investment manager in the software and solutions sector at Intel Capital, one of my focus areas is early stage open source – seed or Series A investments for businesses using open source software to deliver new value or capabilities to customers.

 

Intel has invested in a number of open source companies in the past, including successful firms such as Red Hat, JBoss, MySQL and Zend. In 2006, Intel Capital created the Open Source Incubator Program to engage at the earliest stages with entrepreneurs forming companies that take advantage of new business models enabled by open source. Many early open source companies focused on providing support, consulting services and prebuilt binaries for commercial users. As these businesses matured, they also began offering enterprise features in their software under commercial licenses to drive increased revenue, improve new subscription (conversion) rates and ensure ongoing renewals by enterprise users. We now see many companies offering subscriptions to new, hosted solutions, as well as access to high-value data services. We believe all software companies must transition to a subscription-based model, and truly successful software companies will make use of, and contribute to, open source projects in the normal course of business.

 

SM: BI solutions are moving from traditional reporting and online analytical processing (OLAP) into analytics, advanced visualization, optimization, predictive modeling and data mining. What is Intel Capital’s commitment to BI and position on this increasing use of super-crunching for business intelligence?

PW: Intel sees the movement from traditional reporting to super-crunching as “BI gets real” - an evolution from simple reports and pretty pictures with little meaning, to deep understanding of business dynamics through the use of math. We actively engage BI vendors to drive the increasing use of mathematics in business. Large-scale data mining consumes CPU cycles and drives users to purchase a range of related solutions (storage, compute clusters, interconnect, etc.) to support the computation efficiently.

 

Intel Capital reinforces this trend through our investments in the BI space. We expect that real-time access to large and growing data sets, coupled with the increasingly complex computations needed for modeling and visualization, will drive all aspects of platform evolution - spanning client, server, network and storage infrastructure. And visualization will assume an enhanced role in helping us understand and interact with these complex models and data sets. In short, Intel Capital thinks the future of BI is the future of the computing industry.

 

SM: Intel Capital recently invested in REvolution Computing, a company focused on using parallel computing techniques for computational statistics. REvolution’s products include RPro and ParallelR, which embellish the open source R Project for Statistical Computing with commercial support and parallelism. Could you share with us your thoughts on how commercial R will be a success in the marketplace? What will REvolution Computing add to the platform? Why would a CIO subscribe to REvolution Computing for software he can download for free?

PW: R is widely used by statisticians in many areas, but is typically not deployed for commercial statistical work because it’s not supported or certified for regulated environments such as Pharma and Financial Services. R is similar to the earlier open source JBoss project in this regard. When JBoss was formed, there was no commercial support offering, and it lacked key certifications needed for enterprise deployment. Yet JBoss was one of the most widely used application servers by developers at the time.

 

R has a large and growing user community and pent-up demand for commercial application support. REvolution Computing was formed to address the need in the marketplace for a certified, supported version of R, as well as a high-capability parallel-computing solution for the most demanding R users. REvolution will offer two initial products – RPro, the core support offering that includes certified binaries for enterprise use, and ParallelR, an enterprise-focused commercial add-on that enables R users to dramatically reduce time-to-solution using medium and large-scale parallel computing platforms.

 

Commercial R will only strengthen R’s position by providing a supported and certified version for use in the enterprise. Indeed, we feel that REvolution Computing will do for R what Red Hat did for Linux. No longer will users moving from university settings to the commercial world have to abandon their preferred statistical platform. And businesses will benefit from the incredible influx of R talent and innovation from top universities and statistical research centers around the world. In addition, there are the usual open source advantages of “many eyeballs” to locate and fix bugs, widely available, community-supported add-ons and a large and growing user base with immediate knowledge of R. REvolution is forming an advisory board that will include key members of the open source community and putting a manager in place to work directly with the R community.

 

We see the trajectory of R/REvolution Computing as similar to MySQL. Only a tiny fraction of the total user community of R will become customers of REvolution, but this tiny fraction will represent some of the most demanding commercial users who’ll rely on RPro and ParallelR for mission-critical applications. These demands will help progress R into an even better platform for computational statistics. We anticipate the time when both the open source R community and the users of REvolution’s products recognize that each benefits the other in a way that was not possible before.

 

SM: Look into your crystal ball as a venture capitalist. What is the future for business intelligence, analytics/data mining, open source and commercial open source software?

PW: First, BI will continue to involve more and more math on ever-larger data sets, with users demanding deep, real-time analyses, accurate predictive modeling and immersive, collaborative, interactive 3D visualizations of results. On the analytics front, we believe we’ll continue to see end user demand for improved predictions based on large-scale data sets. Traditional reporting is necessary, but, in the absence of predictive modeling and analysis, it can result in information overload and decision bottlenecks. Analytics leads us towards applications that use predictive models effectively, rather than ones that react to reported data. This is much like the difference between weather reporting and weather forecasting – if we could accurately forecast, then the value of reporting would decline. We eventually might not look at the report unless it failed to match the prediction. This is, we feel, where BI analytics needs to go.

 

Second, we expect open source companies to lead the software industry as the general business models for all software firms evolve. The shift will be away from up-front licensing plus maintenance pricing and towards subscription-based revenue models with heavy focus on open source development. Accompanying this change will be an emphasis on online marketing and fulfillment; hosted solutions offered as integrated aspects of the service; data-centric services that might rely on free and open source software (FOSS) code but deliver, manipulate and generate high-value data; and, finally, increasing use of advertising, primarily in B2C settings, to monetize Web/portal traffic. In the long term, we believe all software firms will be essentially open source, relying on some mix of these more advanced models involving high-value services and access to high-value data to generate revenue.

 

References:

 

  1. Ian Ayres. Super Crunchers - Why Thinking-By-Numbers is the New Way to Be Smart. Bantam Books: 2007.
  2. Thomas H. Davenport and Jeanne G. Harris. Competing on Analytics - The New Science of Winning. Harvard Business School Press: 2007.