Ignoring my fears, I jumped in headfirst, attempting to scale the learning curve of S-PLUS 8 by porting two extensive R scripts that involved connectivity to the Web, database access, programming, statistical models and graphics. To my surprise, I was able to convert the scripts in a reasonable amount of time. The effort was by no means seamless: some functions exist in only one dialect, some function parameters exist in only one, and even functions common to both sometimes behave differently. All told, however, the conversion was much smoother than I remembered from earlier attempts - even with the more complicated scripts. As it turns out, this initial success was a harbinger of what's new in S-PLUS 8.
The new S-PLUS 8 Enterprise edition comes with a more extensive Eclipse-based IDE and debugger, and expands the critical-for-BI capabilities of the big data library for handling data sets larger than memory. There are over a hundred new functions in S-PLUS 8, many of which provide compatibility with R. Frustrated R programmers will now find the convenient "with" function in S-PLUS 8, for example. A major enhancement to S-PLUS 8 is the new package system, a powerful framework for developing and distributing S-PLUS code libraries. This will be a boon for S-PLUS analysts, promoting access to many of the latest statistical procedures developed in the academic and research worlds.
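For readers who have not used "with", here is a minimal sketch of what it buys you. It is written in R syntax and assumes the S-PLUS 8 version behaves the same way for this simple case; the `sales` data frame and its columns are invented purely for illustration.

```r
# Hypothetical data frame, invented for illustration only.
sales <- data.frame(
  units = c(10, 20, 30),
  price = c(2.5, 4.0, 1.5)
)

# Without with(): every column reference repeats the data frame name.
revenue_long <- sum(sales$units * sales$price)

# With with(): the expression is evaluated inside the data frame,
# so columns can be named directly.
revenue_short <- with(sales, sum(units * price))

# Both forms compute the same value.
identical(revenue_long, revenue_short)
```

The convenience grows with the number of columns an expression touches, which is why its absence is an irritant when moving scripts between dialects.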
Of the many attractive developments in S-PLUS 8, none is more significant than the movement to align the disparate S-PLUS and R dialects of the S language. Because both languages are in motion independently, it will be no easy feat to unite them - think of the Sybase and Microsoft SQL Server code bases that forked many years ago. But at the same time, there appears to be a commitment from both the R community and Insightful to collaborate for the common statistical good. Indeed, S-PLUS 8's new package support accommodates both S-PLUS and R language dialects, so now the S-PLUS community can extend the language much as the R community does. And S-PLUS 8 provides tools to convert R packages to work within the S-PLUS environment. This is a big lift for S-PLUS, since R is the lingua franca of academic statistical computing, with many of the latest methods making their debut as R packages available for distribution. I am currently testing rpart and randomForest, two tree-based modeling packages that are extremely popular with R users and were just recently migrated to S-PLUS 8.
While the evolving collaboration between Insightful and the R community offers the S world major benefits of both the commercial and open source (OS) software models, it is at the same time fraught with challenges and pitfalls. The commercial and OS models of software development are very different and seemingly incompatible. The currency of volunteer open source is recognition and status within the community, in contrast to the monetary compensation of commercial peers. And of course, commercial and open source licensing models are 180 degrees apart on cost, usage rights and source code access. Ask an open source community about commercial, proprietary software and they cringe over exorbitant licensing costs, substandard product quality, limited innovation, and lack of access to source code. Ask a proprietary vendor about open source, on the other hand, and they lament the absence of leadership, strategic focus, business discipline and product support in OS initiatives, citing the difficulties of "herding cats" in the OS community - all adding to the risk borne by product adopters. These conflicting perspectives can create a good deal of distrust between the communities, especially if the communities are competing with the "same" product.
Despite the significant challenges, the potential benefits of aligned Insightful and R communities are substantial. The collaboration could well produce a commercial open source (COS) model. Done correctly, that model can combine the accelerated innovation and enhanced product quality of open source with the strategic focus, support and indemnification of the commercial model - all the while cutting costs for consumers.[1]
There are examples of successful COS software companies to guide Insightful/R. Red Hat is the prototype for COS success with infrastructure, and MySQL is preparing for an IPO to hasten its expansion in the database market, while Pentaho and JasperSoft are gaining recognition in the BI space. There are also strong natural synergies between Insightful and R. In the utopian COS statistical world of aligned S-PLUS and R, practitioners would benefit from the worldwide support of an incredibly talented user community, simultaneously deploying modules showcasing the latest statistical and quantitative techniques from the centers of research and academia. In that same utopian world, pragmatic CIOs looking to deploy analytics around their business enterprise could sleep soundly knowing their statistical investment is supported and indemnified, also confident their requests for product enhancement (such as handling large data sets) will be heard and hit the development priority queue.
A new COS-based statistical software company promoting the best of open source R and commercial S-PLUS would not lack for proposed product enhancements to serve an increasingly demanding BI community. The original S language, developed more than 20 years ago, limits data set sizes to something less than available RAM. For most traditional statistical applications, a 1 GB data frame limit is more than adequate. For data mining and business intelligence applications of 2007, however, that limitation is quite restrictive, often forcing analysts to artificially restructure their work. Insightful started to redress the problem with S-PLUS 7, providing a "big data" library of functions to work with large data sets. Though big data provides substantial relief from size limitations by using virtual memory, the solution is far from seamless because users must invoke special big data functions to get the benefit. Alas, big data and not-so-big data don't always interoperate well. A "larger" solution would certainly involve re-architecture of the core platform - an impossible task without strong collaboration from both the R and S-PLUS camps.
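To make "artificially restructure their work" concrete, here is a rough sketch of the kind of workaround an analyst falls back on when a data set will not fit in memory: reading a file in pieces and accumulating a statistic by hand. It uses only base R functions and is not the S-PLUS big data library, whose functions are not shown here. The file, the column name `x` and the chunk size are all made up for illustration.

```r
# Write a small hypothetical CSV so the sketch is self-contained.
path <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:10), path, row.names = FALSE)

chunk_size <- 4                    # rows held in memory at once (illustrative)
con <- file(path, open = "r")
header <- readLines(con, n = 1)    # consume the header line

total <- 0                         # running sum of x
n <- 0                             # running row count
repeat {
  lines <- readLines(con, n = chunk_size)
  if (length(lines) == 0) break    # end of file reached
  chunk <- read.csv(text = lines, header = FALSE, col.names = "x")
  total <- total + sum(chunk$x)
  n <- n + nrow(chunk)
}
close(con)

# Mean computed without ever holding the full data set in memory.
chunk_mean <- total / n
```

A mean is easy to accumulate this way. A regression or a tree model is not, which is why a seamless solution has to live in the core platform rather than in each analyst's script.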