Open Thoughts on Analytics
for Information Management Blogs
SEP 13, 2012 5:45pm ET

Data Science U

Given the growing popularity of business intelligence and analytics as focuses of new academic programs, I guess I shouldn’t be surprised that data science is now beginning to show up on the university radar as well.

This fall, prestigious Columbia University in New York City is offering a course entitled “Introduction to Data Science,” taught by a team under the direction of Google Statistician and Columbia Assistant Professor Rachel Schutt. The class is an outgrowth of the recently created Institute for Data Sciences and Engineering, a joint initiative between Columbia and New York City.

According to the class Web page, “This course is an introduction to the interdisciplinary and emerging field of data science, which lies at the intersection of statistics, computer science, data visualization and the social sciences.” At the end of the semester, students should understand what it’s like to be a data scientist and “be able to do some of what a data scientist does.” Prerequisites include some knowledge of linear algebra and basic probability and statistics, as well as basic programming skills.

Instruction themes revolve around statistics/machine learning, data programming languages and big data tools, all embellished by case studies. Delivery consists of core content lectures, labs and guest presentations from selected data science experts. The initial class of over 60 includes undergrads, graduate students and faculty from a variety of disciplines. The course attempts to address the needs of both pre-professionals and working professionals as well as academic researchers.

Each week highlights a data science (DS) topic reinforced by a statistical/ML content lecture from Schutt, a related lab exercise, and an illustrative guest lecture by a DS practitioner. Topics include exploratory data analysis, visualization, supervised and unsupervised learning, logistic regression, decision trees, time series, sampling, experimental design, recommendation engines, causal modeling, social network analysis, data journalism and data engineering with a focus on big data.

The recommended class texts are the usual outstanding data science suspects on machine learning, probability and statistical analysis, programming with R, Python and Hadoop, and visualization. Eighty percent of the course grade is determined by performance on homework assignments and a team-based class project modeled on Kaggle competitions. Assignments and project work are generally completed in R and Python. Assignment 1 introduces R for basic data programming and visualization, and has students develop a data strategy for RealDirect, a website designed “to make selling and buying a home easier.”

“Introduction to Data Science” (IDS) looks to be a great start to a curriculum that would legitimize data science as an area of academic inquiry. Schutt acknowledges that IDS is a version one product that will evolve. But as a stake in the ground, IDS appears to cover most data science bases.

My bet is that a one-year master's program in data science will emerge from the Institute for Data Sciences and Engineering in the near future. What would also be nice is an undergraduate certificate program involving, say, three or four core data science courses for quantitatively oriented students. Instruction would be in applied statistics, machine learning, data/statistical programming and visualization. Sign me up!

Comments (3)
I've noticed quite a number of courses pop up on Coursera, which is super for picking up the skills as a working professional, outside the US. So, free access to Stanford courses with R and analytics is another avenue.
Posted by Nathaniel U | Wednesday, September 19 2012 at 10:10PM ET
Nathaniel: Indeed--Coursera's set to offer a course from UW called "Introduction to Data Science" next April. Can't wait!
Posted by Jeffrey T | Tuesday, September 25 2012 at 6:20PM ET