Ruby Monday – Part 1 introduced the agile programming language Ruby as my current favorite “data munging” tool in the tradition of Awk, Perl and Python. Today's post elaborates on Ruby's suitability for the basic data management tasks that precede BI and analytics. In next week's finale, I'll outline the “Ruby solution” to my stock portfolio returns simulation.

For data manipulation tasks, Ruby has modern control structures, powerful data constructs and versatile methods that make it easy to craft “munging” programs. Full regular expression syntax and extensive string-handling capabilities facilitate text search and replacement. With powerful Ruby containers and collections (arrays, enumerables and hashes) it's easy, for example, to organize complicated data in memory, pivot text files, merge data sets, compute frequencies and cross tabs, and do complex array calculations with set and range manipulation. Ruby data structures make sorting and control break processing simple, and writing delimited output files a snap. The built-in libraries include methods to manage operating system files, directories and IO, making Ruby quite suitable for system administration. Ruby also works deftly with the outside world: it can consume run-time arguments and manage processes with the OS pipe and fork facilities. I often use these capabilities to wrap R functions and invoke the statistical package in batch from the command line. A powerful exception handling capability is built in, while blocks and iterators promote code economy, supplanting loops for most aficionados. Indeed, the more I work with the language, the more convinced I become that, short of a sophisticated ETL environment, Ruby's an ideal choice for developing BI data movement programs.
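To make a few of these claims concrete, here is a minimal sketch of the frequencies, cross tab and delimited-output tasks mentioned above. The sample data, column names and variable names are all hypothetical and chosen only for illustration; this is not the code from the portfolio simulation.

```ruby
# Hypothetical sample data: tab-delimited rows of symbol, sector and side.
raw = <<~DATA
  symbol\tsector\tside
  IBM\ttech\tbuy
  XOM\tenergy\tsell
  MSFT\ttech\tbuy
  CVX\tenergy\tbuy
  IBM\ttech\tsell
DATA

# Split the text into fields, then turn each record into a hash
# keyed by the column names in the header line.
header, *records = raw.lines.map { |line| line.chomp.split("\t") }
rows = records.map { |fields| header.zip(fields).to_h }

# Frequencies: a hash with a default of 0 counts rows per sector.
sector_freq = rows.each_with_object(Hash.new(0)) do |row, counts|
  counts[row["sector"]] += 1
end

# Cross tab: a nested hash of counts, sector by side.
crosstab = Hash.new { |hash, key| hash[key] = Hash.new(0) }
rows.each { |row| crosstab[row["sector"]][row["side"]] += 1 }

# Build tab-delimited output lines, sorted by sector.
sides = rows.map { |row| row["side"] }.uniq.sort
report = [["sector", *sides].join("\t")]
crosstab.sort_by { |sector, _| sector }.each do |sector, counts|
  report << [sector, *sides.map { |side| counts[side] }].join("\t")
end

puts report
```

The blocks passed to `each_with_object`, `map` and `sort_by` stand in for explicit loops, and the hash defaults remove the usual "is this key already present?" bookkeeping. In a real script the text would more likely come from `File.readlines` or `ARGF` than from an inline string.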
