FCC Uses S-PLUS for Sophisticated Statistical Analysis

  • July 01 1998, 1:00am EDT

PLATFORMS: S-PLUS is running on Windows 95, H-P P90.

BACKGROUND: To make informed decisions, the Federal Communications Commission requires economic analyses of the industries it regulates. Frequently these analyses are statistical or econometric in nature. The Commission's economists have increasingly used S-PLUS to solve problems that other software can't.

PRODUCT FUNCTIONALITY: The Commission regulates over 10,000 radio stations. Frequently it is necessary to examine a sample of radio stations' characteristics. This can easily be done using SQL. We, however, needed a sample based not only on station characteristics but also on statistical criteria that cannot be computed by SQL. The prospect of shuffling so much data back and forth between a statistical package and SQL was daunting. S-PLUS accomplished our goal easily, as if it were a hybrid of SQL and a statistical package. Of course, S-PLUS does much more. I regularly use S-PLUS for sophisticated econometric analyses and for graphical exploration of data.
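The kind of selection described above, a filter on station attributes combined with a statistical criterion, might be sketched as follows. This is only an illustration in Python, not the Commission's S-PLUS code: the station records, the field names (`call_sign`, `band`, `power_kw`), and the choice of a median-absolute-deviation rule as the statistical criterion are all assumptions made for the example.

```python
import statistics

# Hypothetical station records; call signs, fields, and values are invented.
STATIONS = [
    {"call_sign": "WAAA", "band": "FM", "power_kw": 10.0},
    {"call_sign": "WBBB", "band": "FM", "power_kw": 12.0},
    {"call_sign": "WCCC", "band": "FM", "power_kw": 11.0},
    {"call_sign": "WDDD", "band": "FM", "power_kw": 13.0},
    {"call_sign": "WEEE", "band": "FM", "power_kw": 100.0},
    {"call_sign": "WFFF", "band": "AM", "power_kw": 5.0},
]


def select_sample(records, band, k=3.0):
    """Return stations in `band` whose power lies within k MADs of the band median."""
    # Step 1: the SQL-like part, a plain filter on a station characteristic.
    in_band = [r for r in records if r["band"] == band]
    if not in_band:
        return []
    # Step 2: the statistical part, a robust outlier rule (an assumed criterion).
    powers = [r["power_kw"] for r in in_band]
    center = statistics.median(powers)
    mad = statistics.median([abs(p - center) for p in powers])
    return [r for r in in_band if abs(r["power_kw"] - center) <= k * mad]


sample = select_sample(STATIONS, "FM")
print([r["call_sign"] for r in sample])
```

Both steps run in one environment on the same in-memory records, which is the convenience the paragraph attributes to S-PLUS: no data moves between a database and a separate statistical package.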

STRENGTHS: S-PLUS lets you write modules that are more easily substituted than traditional subroutines, which makes combining parts of two programs to form a third much easier than in a traditional programming environment. Programmers coming from compiled environments might find that lazy evaluation takes some getting used to. S-PLUS is extremely accurate. Recently the National Institute of Standards and Technology released its Statistical Reference Datasets (StRD), a set of test problems designed specifically to evaluate the accuracy of statistical packages. I have personally run a dozen packages through the StRD, including many of the more popular ones, and none has approached S-PLUS in terms of accuracy.
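Accuracy against reference values such as those in the StRD is often summarized as the log relative error (LRE), roughly the number of correct significant digits. A minimal sketch in Python follows; the paragraph above does not say which measure was used, so the LRE, the function name, and the numbers are assumptions for illustration, not NIST certified values or results from any package.

```python
import math


def log_relative_error(computed, certified, max_digits=15.0):
    """Approximate number of correct significant digits in `computed`."""
    if certified == 0:
        raise ValueError("relative error is undefined for a certified value of 0")
    if computed == certified:
        # Exact agreement: cap at roughly double-precision resolution.
        return max_digits
    relative_error = abs(computed - certified) / abs(certified)
    return min(max_digits, -math.log10(relative_error))


# Made-up numbers, not taken from the StRD.
print(log_relative_error(2.5003, 2.5))
```

A package that reproduces a certified value to more digits scores a higher LRE, so scores for the same test problem can be compared across packages.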

WEAKNESSES: The flexibility of the S language can sometimes make it difficult to figure out how to write a program. The learning curve is steep: it can take quite some time to attain a modicum of proficiency. The problem arises not when using the command language to run a regression, but when modifying the regression command to do something extra. It is eased by pull-down menus that execute a wide variety of procedures without having to resort to the command language. (This feature was new to version 4.0 and has been further enhanced in 4.5.) The interpreted nature of the S language means that repetitive procedures which cannot be vectorized are quite slow. Fortunately, S has many vectorized functions, so it is not necessary to loop nearly as often as in Fortran. A final weakness is that while S-PLUS is available for Microsoft Windows and UNIX platforms, it has yet to be ported to Linux, though a Linux port is expected soon.
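The contrast between an explicit loop and a vectorized call can be illustrated outside S as well. The sketch below uses Python, which like S is an interpreted language, as a stand-in; the function names are hypothetical. The second version hands the whole collection to built-ins whose iteration runs in compiled code, which is the same idea that makes vectorized S functions faster than element-by-element loops.

```python
import operator


def sum_of_squares_loop(values):
    """Element-by-element loop: every iteration is executed by the interpreter."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_whole(values):
    """Whole-collection form: map and sum iterate inside compiled built-ins.

    `values` must be a sequence such as a list, because it is traversed twice.
    """
    return sum(map(operator.mul, values, values))


data = [1.0, 2.0, 3.0, 4.0]
print(sum_of_squares_loop(data), sum_of_squares_whole(data))
```

How much faster the second form runs depends on the interpreter and the size of the data, so no timing is claimed here.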

SELECTION CRITERIA: I am an econometrician, and I firmly believe that data should be explored before any statistical test is conducted. I purchased S-PLUS solely because it implements algorithms for the graphical analysis of data. I quickly discovered that I could do all my econometric work in S-PLUS, and so it became my primary package.

VENDOR SUPPORT: Support is quite good for traditional questions. However, since the S language is so powerful, S-PLUS is also used for many nontraditional and highly specialized purposes. It is impossible for the support staff to be expert in all areas at the cutting edge of statistics. Therefore, the S-News mailing list, which resides at the Division of Biostatistics at Washington University, serves as an important complement to S-PLUS support. S-PLUS users regularly post their own problems and answer the problems posted by others.

DOCUMENTATION: The vendor's documentation is geared toward use of the pull-down menus, and for this it is fine. Those who prefer to write in the command language will find it difficult to prosper using only the vendor documentation. However, no one takes this route, and third-party texts play a much larger role in the S-PLUS user community than for any other package with which I am familiar.

