A few months ago, I talked with friends at Revolution Analytics about “pretty” big data for statistical analysis. David Smith and Joe Rickert of RA had read one of my blog posts and expressed interest in the multi-million record census data set I’d built from the Public Use Microdata Sample (PUMS).

I, on the other hand, was intrigued by the 120M+ record file RA was using to showcase its RevoScaleR big data statistical analysis capabilities. It turns out the “airline on-time performance data” they analyzed was available from an American Statistical Association website, where it had served as input for an ASA competition on statistical computing and graphics. I decided to see what I could come up with from the data set, though well after the contest had concluded.
