Open Thoughts on Analytics
JAN 6, 2014 2:39pm ET

Still More R and Python


I can't get enough of Python and R. My last blog of 2013 extolled Python; I wrote a flattering R-Python piece several months ago; and I've authored countless articles on R over the years. It's safe to say Python and R are my favorite programming languages.

Comments (6)
Python does not suffer from the same memory problems, as it has an inherently different viewing mechanism. Some write about its lacking graphics capabilities, but I wonder if any of you have heard of matplotlib, bokeh, vincent, or the ggplot port. Another concern I have about people's comparisons between R and Python is the ever-present Pandas mention, as people are doing very impressive analysis with structured numpy arrays fed straight into scikit-learn. Even the Pandas library's author will be the first to tell you it's meant to be a medium-data tool, but I see a lot of former R users who are afraid to leave the dataframe paradigm and make it seem that if Pandas can't do it, 'Python' can't do it. I'd rather spend my development effort in Python than in a language where a handful of people are rewriting half of an inherently slow language's functionality. Keep an eye on Julia.
Posted by Kevin D | Wednesday, January 08 2014 at 10:48AM ET
matplotlib, bokeh and d3py are nice Python graphics libraries, but they won't satisfy statistical needs like R's lattice and ggplot2 packages (think trellis) without lots of add-on programming. Python's ggplot port is promising, but still in its infancy.

Similarly, those wishing to use numpy and scipy directly, without pandas, can do so if they're willing to program. pandas is becoming a de facto standard data management/data analysis library for Python, recognized as such by many other libraries (think ggplot), and it saves analysts lots of data management time that they can then spend on other work. As a data scientist, I enjoy programming, but prefer spending more time on statistics and machine learning.

For me, the key to both R and the evolving Python statistical ecosystem is productive foundational tools that take away much of the programming tedium.

Posted by steve m | Wednesday, January 08 2014 at 2:39PM ET