A colleague recently asked me for a good introductory text on the R statistical computing platform. Though there is a seemingly endless number of published books on R, I recommended a personal favorite, Introductory Statistics with R, Second Edition (Peter Dalgaard). The book does an excellent job of introducing the R language as well as demonstrating R's use for solving real-world statistical problems.
I chuckle when I read uncomplimentary reviews of R documentation by analytics pundits. In addition to scores of books, comprehensive reference manuals, and online help and documentation, there's a wealth of R how-to publications written by the community and freely available to anyone who can run an internet search. One such gem that I recently discovered is The R Inferno by Patrick Burns. The abstract to this brief (103-page) PDF concisely conveys the tome's goal: "If you are using R and you think you're in hell, this is a map for you."
Not only is Burns a capable R analyst, he's also a very clever writer. The R Inferno is a play on the Inferno cantica of Dante Alighieri's The Divine Comedy, in which Dante navigates the nine circles of hell. The circles are concentric, each more depraved than the last and representing increasingly grievous sins, culminating with Satan at the center of hell.
Burns sees the journey through R learning hell through a similar lens. His concentric circles depict problems that typically trip up those new to R. Much attention is focused on vectorizing computations so that they perform efficiently. My experience is proof positive that new R programmers often bring procedural baggage to their learning. Burns also dwells on the many benefits of modular function development in R, as well as its various flavors of object orientation. The eighth circle, Believing It Does as Intended, addresses scores of R gotchas, and is pertinent for even the most experienced R programmers. Finally, circle nine clearly articulates the norms the R community has established for asking for help on its many support lists. The uninitiated who routinely leap before they look are not treated charitably in R land.
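To make the vectorizing point concrete, here is a minimal sketch of my own (it is not an example taken from The R Inferno, and the function names are hypothetical). It contrasts the loop-and-grow habit that procedural programmers tend to bring to R with the vectorized form of the same computation:

```r
# Illustrative sketch only; squares_loop and squares_vec are hypothetical names.

# Procedural habit: grow the result one element at a time inside a loop.
squares_loop <- function(x) {
  out <- numeric(0)
  for (i in seq_along(x)) {
    out <- c(out, x[i]^2)  # copies and extends the vector on every pass
  }
  out
}

# Vectorized equivalent: apply the operation to the whole vector at once.
squares_vec <- function(x) {
  x^2
}

x <- c(1, 2, 3, 4)
squares_loop(x)
squares_vec(x)
```

Both functions should return the same values; the vectorized one is shorter and avoids repeatedly copying the result. Using seq_along(x) rather than 1:length(x) also means the loop version simply does nothing when x is empty.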
After reading Inferno, I was prompted to look in the attic for one of my all-time favorite computer books, the now 35-year-old The Elements of Programming Style, by Kernighan and Plauger. (Aging analysts might recognize Brian Kernighan as co-author with Dennis Ritchie of The C Programming Language, one of the most important programming books of the last 30 years.) Just as Burns uses The Divine Comedy as a metaphor for his writing, Kernighan and Plauger take the timeless and concise writing manifesto, The Elements of Style, by Strunk and White, as their model. And just as I try to remember important S&W dictums like "Put statements in positive form," "Omit needless words," and "Revise and rewrite" when writing, so too do I look to K&P's wisdom to structure programming work: "Let the data structure the program," "Don't patch bad code; rewrite it," "Watch out for off-by-one errors," "Make sure your code does nothing gracefully," and "Make it right before you make it faster." Much like The Elements of Style and The Elements of Programming Style, The R Inferno is destined to become a manuscript that ages well and always rewards those who invest the time to review it.
Steve Miller's blog can also be found at miller.openbi.com.