I had the pleasure of participating in R/Finance 2014 at the University of Illinois, Chicago last Friday and Saturday. It was a “home game” for me, a quick commute from the suburbs for the sixth rendition of a world-class conference. Alas, some of the 300 international participants were less than thrilled with Chicago's mid-May weather. If only the annual conference were held in July....
I'm not a finance quant, but my stats-computation background at least kept me in the understanding game. While I might not get the finance esoterica, underlying statistical concepts like the bootstrap, ARIMA, regularization and MCMC at least resonate. One of the organizing committee members projected that authors of more than half the many R finance packages were in attendance at the conference. Impressive.
The “typical” 20-minute finance model presentation first laid out the conceptual problem, then followed with a mathematical/statistical model formulation. Next came an introduction of the relevant data and a discussion of the detailed R code used for analysis. More often than not, these computations were formalized into an R package made available to the community.
An illustration of the “model” was Gregor Kastner's excellent “Dealing with Stochastic Volatility in Time Series Using the R Package stochvol”. “The R package stochvol provides a fully Bayesian implementation of heteroskedasticity by means of the stochastic volatility framework....The software described in this paper utilizes Markov chain Monte Carlo (MCMC) samplers to conduct inference by obtaining draws from the posterior distribution of parameters and latent variables, which can then be used to predict future volatilities.” Yikes.
A collateral benefit of attending an international R conference is learning about the latest developments of the platform from the perspectives of community contributors, the core team and for-profit vendors, and that certainly was the case with R/Finance 2014.
My participation commenced with an early Friday morning tutorial on data.table by author Matt Dowle. I've written positively on data.table before, and found no reason to change my mind after this session. Indeed, owing to several significant performance advantages, data.table is now encroaching on the sacred data.frame as the storage structure of choice for R programmers. Think of data.frame as a traditional relational store and data.table as a columnar alternative. For analytics work, columnar is generally more efficient and better-performing than relational and that's indeed the case here. As data.table continues to add new SQL-like functionality while supporting existing data.frame uses, its stature can only grow.
R core team member and University of Iowa professor Luke Tierney offered assurances that the base R distribution is progressing as well. Advances in compilation, copy efficiency, memory management and parallelization bode well for R's future as a premier platform for data analysis. Inviting accomplished computer scientists to participate in a team now dominated by statisticians will only help.
Bill Cleveland is a data analysis rock star; his books on visualization and his seminal work on trellis displays at Bell Labs set a standard first for S and then for R. Indeed, R's terrific lattice graphics owe a foundational debt of gratitude to Cleveland. Now a professor at Purdue, Cleveland currently focuses on combining R and Hadoop as an analysis platform for big data. His keynote, “Divide and Recombine for the Analysis of Large Complex Data”, drew on research themes from both the past and present. His split-apply-combine methodology, which uses conditioning, subsetting and sampling to break big, intractable problems into smaller, solvable ones that in many cases provide “good-enough” answers, is both clever and compelling.
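For readers who want a feel for the divide-and-recombine idea, here is a minimal sketch under stated assumptions. It is written in plain Python for illustration only and is not Cleveland's R/Hadoop implementation. The synthetic data, the choice of a least-squares slope as the per-subset analysis, and the helper names (`ols_slope`, `divide`, `divide_and_recombine`) are all hypothetical. The data are divided into random subsets, the same small analysis is applied to each, and the per-subset results are recombined by averaging.

```python
import random
from statistics import fmean


def ols_slope(xs, ys):
    """Apply step: least-squares slope of y on x for one subset."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def divide(rows, n_subsets, seed=0):
    """Divide step: shuffle the rows, then deal them into random subsets."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    return [shuffled[i::n_subsets] for i in range(n_subsets)]


def divide_and_recombine(rows, n_subsets=10, seed=0):
    """Fit each subset independently, then recombine by averaging the slopes."""
    slopes = []
    for subset in divide(rows, n_subsets, seed):
        xs = [x for x, _ in subset]
        ys = [y for _, y in subset]
        slopes.append(ols_slope(xs, ys))
    return fmean(slopes)


# Synthetic data (an assumption for illustration): y = 2x + Gaussian noise.
data_rng = random.Random(42)
xs_all = [data_rng.uniform(0, 10) for _ in range(10_000)]
rows = [(x, 2.0 * x + data_rng.gauss(0, 1)) for x in xs_all]

full_slope = ols_slope([x for x, _ in rows], [y for _, y in rows])
dr_slope = divide_and_recombine(rows, n_subsets=10)

print("all-data slope:          ", full_slope)
print("divide-and-recombine slope:", dr_slope)
```

In this toy setting the recombined estimate lands close to the all-data estimate, which is the “good-enough” trade-off in miniature: each subset fit is small and independent, so the fits could in principle run in parallel.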
Kudos to UIC Finance Professor/Chairman and International Center for Futures and Derivatives Director, Gib Bassett, as well as other R/Finance committee members Peter Carl, Dirk Eddelbuettel, Brian Peterson, Dale Rosenthal, Jeff Ryan and Joshua Ulrich for another great show!