Open Thoughts on Analytics
JAN 6, 2014 2:39pm ET

Still More R and Python

I can't get enough of Python and R. My last blog of 2013 extolled Python; I wrote a flattering R-Python piece several months ago; and I've authored countless articles on R over the years. It's safe to say Python and R are my favorite programming languages.

Comments (6)
Python does not suffer from the same memory problems, as it has an inherently different viewing mechanism. Some write that it lacks graphics capabilities, but I wonder if any of you have heard of matplotlib, bokeh, vincent, or the ggplot port. Another concern I have about people's comparisons between R and Python is the ever-present pandas mention, as people are doing very impressive analysis with structured numpy arrays fed straight into scikit-learn. Even the pandas library's author will be the first to tell you it's meant to be a medium-data tool, but I see a lot of former R users who are afraid to leave the dataframe paradigm and who make it seem that if pandas can't do it, 'Python' can't do it. I'd rather spend my development effort in Python than in a language where a handful of people are rewriting half of an inherently slow language's functionality. Keep an eye on Julia.
Posted by Kevin D | Wednesday, January 08 2014 at 10:48AM ET
matplotlib, bokeh and d3py are nice Python graphics libraries, but they won't satisfy statistical needs like R's lattice and ggplot2 packages (think trellis) without lots of add-on programming. Python's ggplot port is promising, but still in its infancy.

Similarly, those wishing to use numpy and scipy directly, without pandas, can do so if they're willing to program. pandas is becoming a de facto standard data management/data analysis library for Python, recognized as such by many other libraries (think ggplot), and it saves analysts a lot of data-management time that they can then spend on other work. As a data scientist, I enjoy programming, but prefer spending more time on statistics and machine learning.
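To make the "willing to program" trade-off concrete, here is a minimal sketch of a grouped mean computed directly on a NumPy structured array, with no pandas involved. The toy data, the field names `group` and `value`, and the helper `group_means` are all hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy data: a structured array with a label field and a numeric field.
records = np.array(
    [("a", 1.0), ("b", 2.0), ("a", 3.0), ("b", 6.0), ("c", 5.0)],
    dtype=[("group", "U1"), ("value", "f8")],
)


def group_means(arr, key, field):
    """Return a dict mapping each distinct arr[key] to the mean of arr[field]."""
    # labels: sorted distinct keys; codes: index of each row's key within labels.
    labels, codes = np.unique(arr[key], return_inverse=True)
    # Sum the values per group and count the rows per group.
    sums = np.bincount(codes, weights=arr[field])
    counts = np.bincount(codes)
    return {str(label): float(s / c) for label, s, c in zip(labels, sums, counts)}


means = group_means(records, "group", "value")
print(means)
```

With pandas, roughly the same result should come from `pd.DataFrame(records).groupby("group")["value"].mean()`, which is the kind of saved effort the paragraph above has in mind.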

For me, the key to both R and the evolving Python statistical ecosystem is productive foundational tools that take away much of the programming tedium.

Posted by steve m | Wednesday, January 08 2014 at 2:39PM ET