With all the success Norman Nie has had in his 40+ year career in academics and business, you'd think he'd be enjoying a little break at this point. Rather than taking it easy, however, Nie is embarking on yet another big-bet business venture.
After completing his Ph.D. in political science at Stanford University in the 1960s, Nie had a highly successful academic run on the faculty at the University of Chicago before returning to Stanford 12 years ago. Along the way, he held prestigious Woodrow Wilson and Fulbright fellowships. Among Nie's prolific list of scholarly works are two books that won Woodrow Wilson Foundation awards for best political science publications of 1976 and 1996.
But Nie is perhaps best known as co-founder, then President, and finally CEO of SPSS, the Statistical Package for the Social Sciences, a statistics/analytics software company started in 1975 that grew to a successful public offering in 1993. Originally developed to analyze data for Nie’s dissertation, SPSS caught on in a big way among other social scientists and researchers, in its wake revolutionizing empirical social science. SPSS predated current statistics software leader SAS by a few years, successfully transitioning from academic roots to business applications with strong Windows releases. I started using SPSS in 1976 and well remember the big purple user guide that introduced me to SPSS and Norman Nie. Over the years, SPSS crafted a well-deserved reputation for its easy-to-use interface that made predictive models accessible to non-statisticians.
With the 1993 IPO and the 2009 sale of SPSS to IBM for $1.2B, Nie is no doubt financially secure, so I was intrigued when I read he'd become CEO of REvolution Computing, the commercial open source vendor of the R Project for Statistical Computing. An unabashed fan of R, I was pleased when Norman graciously agreed to a series of interviews on analytics and statistical software in general and REvolution's strategy in the exploding market.
This week's Part 1 gives Nie's take on the evolution of statistical software. Part 2 will outline the strategy of REvolution Computing to make R more suitable for a business/BI portfolio, in so doing providing a foundation for the platform to compete with current proprietary leaders SAS and IBM SPSS.
Experts vs. Analytics – your assessment from 40+ years in the business.
I think you're talking about the degree to which decision making in enterprise and scientific research is based on expert opinion versus empirical information and data collection. Over the past 40 years, we have seen a consensus emerge that to ensure efficient operations in enterprise, empirical data must be applied to virtually all fields.
If you look at the last 40 years of university curriculum, SPSS – the product I helped build – has been the dominant player, even becoming the common thread uniting a diverse range of disciplines, which have in turn been applied to business. Data is ubiquitous: tools and data warehouses allow you to query a given set of data repeatedly. R does these things better than the alternatives out there; it is indeed the wave of the future.
You built a highly successful statistical analysis software company, SPSS, over 40 years ago, and have enjoyed great success in both business and academia during that period. Now, much more than financially secure, what motivates you to take the helm of REvolution Computing, the commercial open source R vendor?
REvolution offers an extraordinary opportunity to remake and reenergize an exploding field in predictive analytics. According to analysts, the predictive analytics market is set to experience double-digit growth; the chance to do something like this twice in one's lifetime is motivating in its own right.
What lessons, if any, from the successes and failures of SPSS can be applied to R and REvolution Computing?
My partners and I designed SPSS in an era when being a programmer was a very rare talent. Data was specially designed for a given project, volume was limited, and the number of procedures that people would commit to was limited and unrelated. For years, I urged SPSS to redo their system to deal with the larger amounts of data that were flooding them.
When the opportunity of R and REvolution came to me, it was obvious that I had found the right tool to finally accomplish this feat: a fully programmable statistical language that would allow us to take the next great leap forward. Now, we are putting analytical capabilities in the hands of the individual rather than the organization as a whole. This way, the product is customizable and allows experts to better tailor programs to their specific needs.
SPSS has a proprietary software model, while R is open source, with REvolution Computing the commercial arm. What benefits do you see to the open source model of R vs SPSS? How about the converse, advantages of the proprietary SPSS model vs open source R?
The Open Source development model is a proven vehicle for rapidly creating innovative software. There’s simply nothing that compares to the resources and talent of thousands of world-class data scientists, all around the world, continuously pushing the boundaries by implementing the latest techniques in data analysis, visualization and predictive models in the R environment. No closed-source development team can compete with that.
And the R community itself brings further benefits. By most estimates, there are two million regular users of R software in the open source community. There is a ready-made community that is already fully acquainted with R. It has been taught to the academics who have in turn taught it to their students, and it's increasingly being used as a modeling tool in business and government.
While R has become the training tool of choice, it is still a very specialized tool accessible to well-trained statisticians and data analysts. What a proprietary model like SPSS offers are features that help make the solution more mainstream. For example, SPSS was really among the first to deliver rich GUIs that make it easier for more people to use. This is why one of the first things you’ll see from REvolution is a GUI for R – to make R more accessible and thereby further accelerate adoption.
As a political science professor, you obviously appreciate R's strong worldwide connection with academia. We now see statistics students learning R as part of their graduate studies, much like students 30 years ago learned SAS and SPSS. How will this academic connection translate to commercial market share over time? Is this a leading indicator of R and REvolution success in the future?
It's an interesting phenomenon. Even with the existence of well-established commercial products serving the statistics industry, we are seeing a huge generational shift from SAS and SPSS to R. R has transcended engineering and is rapidly expanding into other disciplines – from business schools to economics to the social sciences. R is catching on across a wide array of disciplines.
Students who are trained on a more powerful programming language can leverage that as they transition into professional data analysts; they were trained on R and they will ask for R as professionals. This is the same market-seeding strategy we employed with great success at SPSS over 40 years ago.
Additionally, academic careers tend to be married to part-time consulting positions. These professors who are now fully behind R will recommend these tools to their business clients. Academics are thought leaders who bring new tools to the enterprise.
In short, R's current rise in the academic world is a strong indicator of an even larger generational shift that will occur in the business world. REvolution R has removed major barriers to R adoption in industry, beyond academia, by making R run faster and allowing it to handle greater volumes of data.
Steve also blogs at Miller.OpenBI.com.