Open Thoughts on Analytics
JAN 6, 2014 2:39pm ET

Still More R and Python


I can't get enough of Python and R. My last blog of 2013 extolled Python; I wrote a flattering R-Python piece several months ago; and I've authored countless articles on R over the years. It's safe to say Python and R are my favorite programming languages.


Comments (5)
Python does not suffer from the same memory problems, as it has an inherently different viewing mechanism. Some write about its lacking graphics capabilities, but I wonder if any of you have heard of matplotlib, bokeh, vincent, or the ggplot port. Another concern I have about people's comparisons between R and Python is the ever-present Pandas mention, as people are doing very impressive analysis with structured numpy arrays fed straight into scikit-learn. Even the Pandas library's author will be the first to tell you it's meant to be a medium-data tool, but I see a lot of former R users afraid to leave the dataframe paradigm, and they make it seem that if Pandas can't do it, 'Python' can't do it. I'd rather spend my development effort in Python than in a language where a handful of people are rewriting half of an inherently slow language's functionality. Keep an eye on Julia.
Posted by Kevin D | Wednesday, January 08 2014 at 10:48AM ET
matplotlib, bokeh and d3py are nice Python graphics libraries, but they won't satisfy statistical needs like R's lattice and ggplot2 packages (think trellis) without lots of add-on programming. Python's ggplot port is promising, but still in its infancy.

Similarly, those wishing to use numpy and scipy directly, without pandas, can do so if they're willing to program. pandas is becoming a de facto standard data management/data analysis library for Python, recognized as such by many other libraries (think ggplot), and it saves analysts lots of data-management time that they can then spend on other work. As a data scientist, I enjoy programming, but prefer spending more time on statistics and machine learning.

For me, the key to both R and the evolving Python statistical ecosystem is productive foundational tools that take away much of the programming tedium.

Posted by steve m | Wednesday, January 08 2014 at 2:39PM ET
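
To make the trade-off in the two comments above a little more concrete, here is a minimal sketch of working with a structured numpy array directly, without a dataframe library. The field names, the sample values, and the variable names (`records`, `features`, `group_means`) are all made up for illustration; neither commenter supplied any code.

```python
import numpy as np

# Hypothetical sample data: a structured array with named, typed fields.
records = np.array(
    [("a", 1.0, 10.0), ("b", 2.0, 20.0), ("a", 3.0, 30.0), ("b", 4.0, 40.0)],
    dtype=[("group", "U1"), ("x", "f8"), ("y", "f8")],
)

# Stack the numeric fields into a plain 2-D float matrix
# (rows = samples, columns = features), the layout that
# estimator-style libraries generally expect as input.
features = np.column_stack([records["x"], records["y"]])

# A group-wise mean written by hand: the sort of bookkeeping that a
# dataframe library would otherwise take care of.
group_means = {
    str(g): float(records["x"][records["group"] == g].mean())
    for g in np.unique(records["group"])
}

print(features.shape)
print(group_means)
```

The feature matrix takes one line to build, while the grouped summary needs an explicit loop over the group labels. That is roughly the "willing to program" cost the second comment describes, and whether it matters will depend on how much data management the analysis involves.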


