SAS, WPL Code Competition May Heat Up
The World Programming System (WPS) SAS language clone has been front and center with me the last couple of weeks. Open Thoughts on Analytics readers might remember my praise for the low-cost competitor to the basic SAS language that became ubiquitous in the statistical world.
WPS emerged a few years ago as an alternative that can dramatically reduce annual license premiums for companies invested in SAS data step programming. And in theory, replacing SAS with WPS for SAS programming tasks is pretty seamless. In my own testing, WPS has worked perfectly reading SAS data sets and handling all data step and macro code I’ve thrown at it.
For the past few years, World Programming Ltd (WPL), the company that produces WPS, has been embroiled in litigation with SAS over purported copyright violations by its compiler clone. The Court of Justice of the European Union ruled favorably for WPL last week, noting that “There is no copyright infringement” when a software company without access to a program’s source code “studied, observed and tested that program in order to reproduce its functionality in a second program”. Tellingly, the court opined that ‘software companies can’t rely on copyright rules to prevent rivals from “reverse engineering” computer programs’.
Stay tuned. You can certainly bet that the legal battles between SAS and WPL aren’t over. You can also figure that SAS will take WPL seriously as a competitor, using its muscle to combat the pesky startup. I hope for sustained competition that’ll benefit all statistical analysis consumers.
A week or so ago, I got an email from Phil Rack asking if I’d like to take a look at the latest release of his software product, Bridge2R, that facilitates interoperation between WPS (or SAS) and R. Phil and I share a lot of statistical history, first in SAS and now in R and WPS. His company, Minequest, is currently a reseller of WPS.
After I ftp’d the software and started taking a look, I realized the bridge was primarily WPS/SAS macro code. Having almost gone cross-eyed years ago trying to make sense of the SAS macro special characters “%” and “&”, part of me wanted to run for the nearest exit. But I persisted.
Bridge2R’s primary charter is pretty simple: facilitate the sharing of WPS data sets with R, and R data frames with WPS. Let’s say you have a WPS (or SAS) library and would like to use R’s ggplot graphics on one of the data sets. In theory, just invoke the Bridge2R start macro function with the data set name and, voila, a corresponding R data frame is created for the duration of your WPS session. If you want to persist the newly-created structure, simply save it in an R data file. Going from R to WPS is just as straightforward, the libname prefix determining whether the data set is permanent.
There were a few minor glitches getting started with Bridge2R. I first needed to upgrade my WPS installation to a maintenance release. No problem. I then discovered I had to use the libname prefix for all WPS data sets, even the temporary “work” ones. Phil and I are still scratching our heads over that, though it’s probably a setup problem on my side. I’ll figure it out later.
Once I was past those issues, Bridge2R worked well. I was able to move data between WPS and R with ease, testing my usual-suspect data files. The final script involved invoking proc summary on a 52 million record natality data set, then feeding the summarized output to R for visualization using its powerful lattice graphics package. It felt nice being able to divvy up the statistical labor so seamlessly between the two powerful products.
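To make that division of labor concrete, here is a minimal sketch of the summarize-then-hand-off pattern. It is written in Python purely for illustration and is not WPS, SAS, or Bridge2R code; the function names, field names, and sample records are all hypothetical. The idea is simply that the heavy pass over the raw records produces a small table, and only that small table crosses over to the plotting tool.

```python
import csv
import io
from collections import defaultdict


def summarize(records, class_var, analysis_var):
    """Group records by class_var and return the count and mean of analysis_var,
    roughly the kind of table a summary procedure with one class variable gives."""
    totals = defaultdict(lambda: [0, 0.0])  # class value -> [n, running sum]
    for rec in records:
        cell = totals[rec[class_var]]
        cell[0] += 1
        cell[1] += rec[analysis_var]
    return [
        {class_var: key, "n": n, "mean": total / n}
        for key, (n, total) in sorted(totals.items())
    ]


def to_csv(rows):
    """Serialize the small summary table so a separate plotting tool can read it."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical stand-in for the natality records; the field names are invented.
births = [
    {"year": 2005, "weight_g": 3300.0},
    {"year": 2005, "weight_g": 3500.0},
    {"year": 2006, "weight_g": 3200.0},
]

summary = summarize(births, "year", "weight_g")
csv_text = to_csv(summary)
```

However many raw records go in, the hand-off is only one row per class value, which is why passing summarized output to a graphics package is cheap even when the source data set runs to tens of millions of records.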
For those in organizations seeking a comprehensive yet low-cost statistical platform, the combination of WPS and R linked by Bridge2R is definitely worth a look. WPS can read SAS data sets and handle legacy SAS data step and macro code with aplomb, perhaps leading to savings on annual SAS licensing fees.
And while WPS currently offers only a fraction of SAS’s statistical capabilities, the procs it does feature, such as univariate, summary, tabulate, sql, print, anova, freq, reg and logistic, are workhorse mainstays. WPS also provides easy connectivity to a variety of databases. The statistical procedures it doesn’t support (plus many others) can be found in R, with Bridge2R providing the means to share WPS data with R and vice versa. If you’re on a strict statistical budget, this tandem could well be a winner.