R/Finance 2015, the seventh annual conference of applied finance using R, was held May 29-30 at the University of Illinois at Chicago. Over 300 professionals from around the world, many of them quantitative finance luminaries, participated. Just my luck: I was unable to attend on a gorgeous Friday, but battled the cold, wind and rain en route to UIC on Saturday. The slog was well worthwhile.

The conference showed the strength of a vibrant open source community. Almost every presentation included a discussion of theory, computational methods and freely available R packages implementing the techniques.

I enjoyed the little I understood of the quantitative finance prezes by university professors, students and practitioners. Yet the math behind several of the presentations, while focused on finance, was also relevant to data science applications. As a bonus, there were updates on some of the latest developments in core R data management and predictive models.

Mathematician Bryan Lewis discussed singular value decomposition applications in finance serving the three masters of accuracy, regularization and efficiency. Those in the machine learning world have no doubt seen SVD used in support of similar objectives involving optimization with their predictive algorithms.
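For readers who haven't used SVD this way, here is a minimal sketch in base R of the general idea, not a reconstruction of anything Lewis presented. The simulated returns matrix, the two-factor structure and every variable name are my own assumptions for illustration. Keeping only the largest singular values yields a low-rank approximation: the discarded directions are mostly noise (a form of regularization), the relative error measures what was given up (accuracy), and only k factors need to be stored (efficiency).

```r
set.seed(1)

# Hypothetical data: 250 days of returns on 20 assets, driven by
# 2 common factors plus a small amount of idiosyncratic noise.
n_days <- 250; n_assets <- 20; n_factors <- 2
factors  <- matrix(rnorm(n_days * n_factors), n_days, n_factors)
loadings <- matrix(rnorm(n_factors * n_assets), n_factors, n_assets)
noise    <- 0.1 * matrix(rnorm(n_days * n_assets), n_days, n_assets)
returns  <- factors %*% loadings + noise

# Full decomposition: returns = U diag(d) V'
s <- svd(returns)

# Keep only the k largest singular values (truncated SVD).
k <- 2
low_rank <- s$u[, 1:k] %*% diag(s$d[1:k], nrow = k) %*% t(s$v[, 1:k])

# Share of total squared variation captured by the first k components.
explained <- sum(s$d[1:k]^2) / sum(s$d^2)

# Relative Frobenius-norm error of the rank-k approximation.
rel_error <- norm(returns - low_rank, "F") / norm(returns, "F")

cat("explained:", explained, " relative error:", rel_error, "\n")
```

The two summary numbers are tied together: the squared relative error equals one minus the explained share, so choosing k is a direct trade-off between compactness and fidelity.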

R/Finance committeeman and R-C++ integration guru Dirk Eddelbuettel regaled the audience with a discussion of Rblpapi, an R interface to the Bloomberg API that can be used to retrieve financial data. Want to get a charge out of Dirk? Tell him Microsoft is superior to Linux for development.

Cornell grad student Nicholas James' "Efficient Multivariate Analysis of Change Points" hit home as a reasonable means of detecting inflection points in time series data. I'm planning on taking his R package ecp (Non-Parametric Multiple Change-Point Analysis of Multivariate Data) for a test drive.

James' Cornell classmate, William Nicholson, spoke on "Structured Regularization for Large Vector Autoregression". One of the techniques he discussed, the Lasso, is quite familiar to ML practitioners. Nicholson's R package, BigVAR, seems worth a look as well.

Sitting at the same table, I had the opportunity to converse extensively with presenter Mark Seligman, author of the Rborist package, a high-performance implementation of the random forest machine learning algorithm. I've been smitten by R's randomForest package for years, but have ultimately been frustrated by its inability to scale. Rborist, with algorithmic advances as well as GPU support, has already delivered boosts of half an order of magnitude in both data size and performance in my limited testing. Rborist is admittedly green, but I'm more than willing to contribute testing muscle to complement Seligman's development prowess.

R package developer extraordinaire and RStudio executive Hadley Wickham spoke on data ingest with R. Wickham's a prolific developer, well known for his ggplot2 graphics and dplyr data manipulation packages, in addition to being a terrific writer and presenter. Data ingest is all about R's core DBI database package, as well as Wickham's flat file, Excel and statistical file ingestion packages: readr, readxl and haven. Wickham's clearly making his mark extending R functionality to compete with Python in the messy world of data science. I'm closely watching the advancement of rvest, a new package patterned after Python's BeautifulSoup "to make it easy to download, then manipulate, both html and xml."

Matt Dowle presented on his outstanding data.table package, which on some levels competes with Wickham's dplyr, to the benefit of the R community. As I've noted before, I'm hooked on data.table, finding it a highly productive tool for ingesting, managing and manipulating 10+ GB data sets on my notebook. On a current predictive analytics project, data.table has been indispensable for identifying data anomalies that need to be addressed before the modeling phase.
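To give a flavor of that kind of anomaly hunting, here is a small sketch, assuming the data.table package is installed. The trades table, its columns, the planted bad values and the cutoff of 5 are all made up for illustration; this is not the project data or the exact checks I ran. It flags missing values and extreme outliers within each group using a robust z-score built from the median and the MAD.

```r
library(data.table)

set.seed(42)

# Hypothetical data: 1,000 trades spread across four made-up desks.
trades <- data.table(
  id     = 1:1000,
  desk   = sample(c("A", "B", "C", "D"), 1000, replace = TRUE),
  amount = rnorm(1000, mean = 100, sd = 15)
)

# Plant two anomalies: an absurd value and a missing value.
trades[c(10, 500), amount := c(-9999, NA)]

# Per-desk profile: row counts, missing counts and value range.
profile <- trades[, .(
  n          = .N,
  n_missing  = sum(is.na(amount)),
  min_amount = min(amount, na.rm = TRUE),
  max_amount = max(amount, na.rm = TRUE)
), by = desk]

# Robust z-score within each desk; the median and MAD are far less
# distorted by the outliers themselves than the mean and sd would be.
trades[, robust_z := (amount - median(amount, na.rm = TRUE)) /
                      mad(amount, na.rm = TRUE), by = desk]

# Flag rows that are missing or sit far from their desk's median.
# The cutoff of 5 is an arbitrary choice for this sketch.
anomalies <- trades[is.na(amount) | abs(robust_z) > 5]

print(profile)
print(anomalies)
```

The grouped `by = desk` operations and the in-place `:=` updates are what keep this style of profiling quick on large tables, since no copies of the data are made along the way.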

Kudos on a seventh consecutive outstanding job to the R/Finance Committee, headed by Gib Bassett, Professor Emeritus of Finance, Interim Department Head and Director of the International Center for Futures and Derivatives, as well as Jeffrey Ryan, Dirk Eddelbuettel, Dale Rosenthal, Brian Peterson, Peter Carl and Josh Ulrich.
