Statistics and Financial Engineering

I missed R/Finance 2011: Applied Finance with R in late April this year. I'd been to the first two and witnessed significant growth of a new international conference.

According to sponsors, this year's edition eclipsed the second as the second did the first. And as I browse the presentations, I see plenty of innovation in mathematical/statistical finance, investment science, data analysis and R package development to excite me.

The field of financial engineering and its dialects has exploded in the last 15 years. Top universities now offer programs in FE, financial math, financial economics, finance statistics, computational finance and more. Courses are taught by business, math, finance, computer science, statistics, operations research and economics departments. An MS in Financial Engineering from a top school, generally a one-to-two year program for those fortunate enough to gain admittance (an 800 on the GRE/GMAT Math is almost mandatory), is often a ticket to a high-paying job on Wall Street.

Though I haven't had much formal training in FE statistics, I have picked up a bit over the years, if for no other reason than that investment return data are so accessible. For example, I recently built a data set consisting of daily returns from all stocks in the Russell 3000 index to analyze in R. And there are comprehensive financial engineering/statistics packages available in R to support even the most far-out FE esoterica.

Indeed, I look at FE statistics in much the same way I look at psychometrics, econometrics, biometrics and mathematical science: all disciplines outside mainstream statistics that add valuable techniques to statistical science. Wherever I can find good analytics ...

At R/Finance 2009, I had the opportunity to listen to and meet David Ruppert, a Cornell professor and author of a very accessible introductory book, “Statistics and Finance: An Introduction,” that I'd grown to love. SF, written in 2004, features examples with code written in SAS and Matlab. When I approached Ruppert after his talk, he noted he was working on a more comprehensive book to be published in late 2010 that would replace SAS with R for the examples. I made a mental note to purchase the new book when it hit the market.

“Statistics and Data Analysis for Financial Engineering” is a terrific reference for those like me who prefer an applied focus to an emphasis on theory/math. Actually, I'd say the 600+ page SDA presents a nice balance of concepts, math and applications with computer code.

The book contains intermediate-level investment science material on investment returns, bonds, portfolio theory and the capital asset pricing model. It also does a good job on the financial statistics front with chapters on risk management, copulas and cointegration.

But it's with the sections on core statistics that have applicability inside and outside finance that this book really shines. The chapters on exploratory data analysis and univariate distributions are terrific, demonstrating well the power of variable transformations, kernel densities and graphics to help analysts get to know their data. Of course, all techniques are supported by R packages and functions.

For those new to time series/forecasting analysis, SDA is a nice place to start. Ruppert does a good job introducing random walks, stationarity and white noise series, laying the groundwork for the autoregressive integrated moving average (ARIMA) models popularized by Box and Jenkins. The examples using returns data are pertinent and the discussion of seasonal models hits the mark with applied forecasters like me. Ruppert's treatment of special topics such as multivariate time series, as well as ARCH and GARCH models, clarified areas of previous confusion.
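
To make the random walk versus white noise distinction concrete, here is a minimal sketch on simulated data. The book's examples are in R; this one uses only Python's standard library, and the seed, series length and helper name `lag1_autocorr` are my own assumptions, not anything taken from SDA. It illustrates why differencing matters in ARIMA modeling: a random walk is strongly autocorrelated, while its first differences behave like white noise.

```python
import random


def lag1_autocorr(xs):
    """Sample lag-1 autocorrelation of a sequence of numbers."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[i] - mean) * (xs[i - 1] - mean) for i in range(1, n))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den


rng = random.Random(42)  # arbitrary fixed seed so the run is repeatable

# White noise: independent draws with constant mean and variance (simulated).
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# Random walk: the running sum of those same shocks.
walk = []
level = 0.0
for shock in noise:
    level += shock
    walk.append(level)

# First differences of the walk recover the shocks (up to rounding).
diffs = [walk[i] - walk[i - 1] for i in range(1, len(walk))]

for name, series in [("white noise", noise),
                     ("random walk", walk),
                     ("differenced walk", diffs)]:
    print(f"{name:>16}: lag-1 autocorrelation = {lag1_autocorr(series):.3f}")
```

The walk's lag-1 autocorrelation should come out close to one and the other two close to zero, which is the informal signal that one round of differencing is enough here.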

Though many stats texts cover basic linear regression models comprehensively, Ruppert shows how to apply them to the CAPM and factor models of investment science. It was comforting to discover I wasn't too far off estimating Fama-French three-factor models with data available on the Web. The advanced regression topics that include coverage of splines and smoothers should be welcomed by all analysts charged with fitting exploratory curves to noisy data.
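
As a rough illustration of the regression view of the CAPM, the sketch below estimates a stock's beta as the least-squares slope of its excess returns on the market's excess returns. Everything here is an assumption of mine for demonstration: the returns are simulated rather than real, the "true" beta of 1.3 and the volatilities are arbitrary, and `ols_line` is a hypothetical helper, not a function from the book or from any R package.

```python
import random


def ols_line(x, y):
    """Least-squares intercept and slope for y = alpha + beta * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


rng = random.Random(7)  # arbitrary fixed seed
TRUE_BETA = 1.3         # assumed value used only to generate the fake data

# Simulated daily excess returns (returns minus the risk-free rate).
market_excess = [rng.gauss(0.0004, 0.01) for _ in range(1000)]
stock_excess = [TRUE_BETA * m + rng.gauss(0.0, 0.015) for m in market_excess]

alpha_hat, beta_hat = ols_line(market_excess, stock_excess)
print(f"estimated alpha = {alpha_hat:.5f}, estimated beta = {beta_hat:.3f}")
```

A three-factor model extends the same idea by adding size and value factors as further regressors, which calls for multiple regression rather than the single-slope formula used here.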

As in his earlier book, Ruppert pays special attention to the computationally intensive resampling techniques of the bootstrap and cross-validation. A special treat at the end is the solid introduction to Bayesian analysis, including the Markov chain Monte Carlo (MCMC) estimation of posterior densities using WinBUGS and its R interface, R2WinBUGS. Yikes!
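
For readers who haven't met the bootstrap, a minimal sketch of the basic idea follows: resample the data with replacement many times, recompute the statistic each time, and use the spread of those replicates as a standard error. This is my own illustration in standard-library Python on simulated data, not code from the book; the helper name `bootstrap_se`, the seeds and the number of replicates are all assumptions.

```python
import math
import random
import statistics


def bootstrap_se(data, stat, n_boot=1000, seed=0):
    """Bootstrap standard error of stat(data) from n_boot resamples."""
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Draw n observations with replacement from the original sample.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(stat(resample))
    return statistics.stdev(replicates)


sample_rng = random.Random(11)  # arbitrary fixed seed
data = [sample_rng.gauss(0.0, 1.0) for _ in range(400)]  # simulated sample

se_boot = bootstrap_se(data, statistics.fmean)
se_formula = statistics.stdev(data) / math.sqrt(len(data))  # textbook SE of a mean

print(f"bootstrap SE of the mean: {se_boot:.4f}")
print(f"formula SE of the mean:   {se_formula:.4f}")
print(f"bootstrap SE of the median: {bootstrap_se(data, statistics.median):.4f}")
```

For the mean, the bootstrap answer should land close to the textbook formula. The payoff is with statistics like the median, where no equally simple formula is available but the same function works unchanged.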

SDA is a meaty book – over 600 pages total. Ruppert certainly doesn't cheat his readers on quantity – or quality. All told, I'd give the book a solid A grade and recommend it enthusiastically as a general intermediate statistics reference – for those outside finance as well as those in.
