Open Thoughts on Analytics
for Information Management Blogs
FEB 7, 2012 9:02am ET

Omniscope and R

Ask just about any data scientist or science of business BI specialist about critical components of her job, and she’s almost certain to mention data integration (munging), statistical analysis and visualization at the top of her list. Once the data are in place, the attention often turns from integration to the powerful combination of statistical analysis and visualization for discovery/exploration.

I can’t imagine there’s a bigger advocate of the R platform for analytics and data programming than me. I love R’s core statistical graphics, as well as the add-on lattice and ggplot2 graphics packages, and use them all the time in my work. But I’m less enthused about R’s interactive visualization capabilities. My take is that commercially available software trumps its open source competition at this point. That’s frustrating, because exploratory data analysis nirvana for me is a robust visualization platform that interoperates with R.

Over the past few months I’ve been evaluating visualization software from Tableau, Spotfire, QlikView, Advizor Solutions and Visokio. There’s plenty to like about each of the vendors’ products. In fact, I wouldn’t discourage potential customers from any of those choices.

The new demo release of Omniscope from UK vendor Visokio especially caught my eye. An email I received from the product manager mentioned new R integration and provided links to several recorded demos explaining how it works. Though excited, I made a mental concession that I’d be happy if the capability simply allowed me to access native R data in Omniscope – essentially, an ODBC or JDBC for R. At the same time, after having evaluated what appears to be similar interoperability between Spotfire and R a year and a half ago, I was interested in how the Omniscope/R connectivity compared. I’m glad I invested the time to find out.

Omniscope is somewhat distinctive among its peers. The software has two major product components: DataExplorer, which provides data discovery & analysis, reporting and dashboarding, and DataManager, which offers tools to build and manage data sets. DataManager is essentially a poor man’s ETL tool, providing a drag and drop visual workflow to drive data extraction, merge, transformation and delivery on a small scale. When I first started looking at Omniscope, I pooh-poohed DataManager, thinking I’d never need it. After all, what can’t I do with Pentaho Data Integration and Ruby? I later started to appreciate DM quite a bit.

Once Omniscope determines the installed R directory structure, programmers can write R code in DataManager for either data access (Source) or data manipulation (Operations). I first took on the simpler Source, learning how to use the editor to develop a script that loads a 5.4M-record R data frame into memory and ultimately returns a random sample of 100,000 records, along with predictions from a cubic spline regression model, to DataExplorer for exploration. Flush with that success, I then coded the beginnings of several generic tools that are driven by R metadata and dynamic statement building. I was getting quite comfortable using R as a DataManager Source.
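The R side of a Source task like this might look roughly like the sketch below. It is not the script from the post: the data frame `big`, its columns `x` and `y`, the helper `sample_rows` and the small sizes are all made up for illustration, and how Omniscope hands the result to DataExplorer is not shown.

```r
library(splines)  # ships with R; provides bs() for cubic B-spline bases

set.seed(42)  # reproducible illustration

# Hypothetical stand-in for the large data frame described above
# (kept small here so the sketch runs quickly).
n <- 50000
big <- data.frame(x = runif(n, 0, 10))
big$y <- sin(big$x) + rnorm(n, sd = 0.3)

# Draw a simple random sample of rows, without replacement.
sample_rows <- function(df, size) {
  df[sample.int(nrow(df), size), , drop = FALSE]
}

smp <- sample_rows(big, 1000)

# Fit a cubic spline regression (bs() defaults to degree 3)
# and attach the fitted predictions to the sampled rows.
fit <- lm(y ~ bs(x, df = 6), data = smp)
smp$pred <- predict(fit, newdata = smp)

head(smp)
```

Sampling before fitting keeps the model fit and the data returned for visualization small, at the cost of fitting on the sample rather than on the full data frame.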

Operations extends the programming power of R to other data in a DataManager stream. As an illustration, I loaded a comma-delimited file and then linked the contents to an R script that pivots the text data using functions from the R reshape package. After a few more R programming statements, the “munged” data are returned to DataExplorer for discovery. I’ve tried similarly reshaping/filtering/enhancing input data from several other NBER files, all with positive results. While I’m sure I could do the pivoting tasks using other DataManager Operations functions, I was able to accomplish everything I needed to do with the R language I know.
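A pivot of this kind can be sketched in a few lines of R. To keep the example runnable without installing anything, it uses base R's `reshape()` rather than the reshape package mentioned above, and the inline CSV text and its column names (`id`, `year`, `measure`, `value`) are invented for illustration.

```r
# Hypothetical long-format ("stacked") input, standing in for a
# comma-delimited file: one row per id, year and measure.
csv_text <- "id,year,measure,value
A,2010,wage,10
A,2010,hours,40
A,2011,wage,11
A,2011,hours,38
B,2010,wage,12
B,2010,hours,35
"
long <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Pivot long to wide: one row per (id, year), one column per measure.
wide <- reshape(long,
                idvar     = c("id", "year"),
                timevar   = "measure",
                direction = "wide")

# reshape() names the new columns value.wage, value.hours; tidy them up.
names(wide) <- sub("^value\\.", "", names(wide))
rownames(wide) <- NULL

wide
```

With the reshape package, the same pivot would typically be written with `cast()` on the long data; the base version above is only a stand-in for that step.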

My next test was to link several R Operations tasks in succession: the first to read an already-input stacked time series data file and create new variables; the second to reshape and restrict the resulting stream; and the third to invoke the Holt-Winters forecasting function to “predict” the next 30 days of measurements for each of the selected series. This, too, worked well. The forecasts behaved as expected and I was able to examine the results visually in DataExplorer. Cool stuff.
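The forecasting step could look something like the following sketch, which applies `HoltWinters()` from R's stats package to each series in a stacked data frame and predicts 30 days ahead. The simulated data, the column names, the helper `forecast_series` and the weekly seasonal period are assumptions made for illustration, not details from the tests described above.

```r
set.seed(1)  # reproducible illustration

# Hypothetical stacked daily data: one row per (series, day).
days <- 1:140
stacked <- rbind(
  data.frame(series = "a", day = days,
             value = 50 + 0.1 * days + 5 * sin(2 * pi * days / 7) + rnorm(140)),
  data.frame(series = "b", day = days,
             value = 20 + 3 * cos(2 * pi * days / 7) + rnorm(140))
)

# Fit Holt-Winters to one series and forecast `horizon` days ahead.
forecast_series <- function(df, horizon = 30) {
  df  <- df[order(df$day), ]
  y   <- ts(df$value, frequency = 7)  # weekly seasonality is an assumption
  fit <- HoltWinters(y)               # additive seasonal model by default
  data.frame(series   = df$series[1],
             day      = max(df$day) + seq_len(horizon),
             forecast = as.numeric(predict(fit, n.ahead = horizon)))
}

# Apply the forecaster to each series and stack the results again.
forecasts <- do.call(rbind,
                     lapply(split(stacked, stacked$series), forecast_series))
rownames(forecasts) <- NULL

head(forecasts)
```

Returning the forecasts in the same stacked layout as the input keeps the output easy to append to the history and plot by series.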

Though there are a few gotchas like missing data surprises, based on my tests to date, count me as an enthusiastic supporter of the interoperability of R and Omniscope. The combination indeed makes for a powerful data science exploration platform. And for Agile BI at its finest.

Kudos to Visokio for a job well done. Me? I’m ecstatic. I can now have my interactive visualization cake and eat my R for statistical analysis, too!

Comments (2)
Hi Steve,

excellent post, but have you ever looked at http://rosuda.org/mondrian/? Open Source data visualization tool that also supports loading data directly from R workspaces

best, Jos

Posted by Jos v | Saturday, February 11 2012 at 1:50PM ET
Jos --

Thanks for the comment.

I actually have worked with Mondrian, and wrote an article on it 3 years ago. I haven't looked at later releases, and though I generally liked what I saw then, I wouldn't put it in the class of Tableau or Omniscope now.

If I'm not mistaken, there were some memory limitations. And while I remember reading R data, I don't recall the ability to use the R language on non-R data like you can in Omniscope. R in Omniscope is thus a powerful "munging" language.

Best, Steve

Posted by steve m | Sunday, February 12 2012 at 8:18AM ET
Blog Archive for Steve Miller

Politics of Data Models and Mining
SAS, WPL Code Competition May Heat Up
SAS vs. R: Statistical Modeling Rivalry Renewed
Machine Learning Hits the Books
Modeling an IT Earnings Disparity
