My partners and I “debate” each year which majors are “best” for our data science and BI consulting business. Not surprisingly, the computer science partner thinks CS is best, the information systems partner prefers his major, which is often housed in business schools, and the engineering guy likes industrial and electrical engineers. A stats major myself, I prefer applied math/stats/econ/OR. We all agree that whatever the major, a quantitative orientation with strong programming, database and communication skills is essential.
Undergrad is often just a start now, though; many BI and DS professionals pursue advanced degrees, either before or after they enter the workforce. I’ve written recently on several graduate predictive analytics programs that are pertinent to our craft. I also like the new MS in Computational Finance from the University of Washington as a training ground for quants. I’ve even detailed my own version of an applied statistics master’s program.
As I re-read that year-and-a-half-old blog, I realize that while the stats part is strong, the program is short on computational focus, especially given the ascent of big data, the Hadoop ecosystem and ever-evolving machine learning algorithms. My 2012 version of that curriculum would probably sacrifice a few stats courses for computer science and optimization alternatives.
Along those lines, I recently came across mention of a new master’s program in Computational Science from the second-best engineering school in Cambridge, MA that looks quite relevant for budding data scientists. “A new master’s degree program in Computational Science and Engineering (CSE) will be launched at Harvard during the coming academic year, with the aim of training new leaders for a future where large-scale computation and advanced mathematical modeling will propel discovery and innovation in fields from psychology to photonics…. The Harvard program will offer a curriculum broader than typical for master’s degrees in computational science, anchored by core courses in both computer science and applied mathematics and embracing a wide range of applications, including the social sciences in particular.”
The one-year curriculum gets a strong endorsement from Jennifer Chayes, Distinguished Scientist and Managing Director of Microsoft Research New England and Microsoft Research New York City. “Many of the defining questions of this era in science and technology will be centered on ‘big data’ and machine learning. This master’s program will prepare students to answer those questions by integrating and applying computation and engineering with other disciplines, including both physical and social sciences.”
Core courses include Foundations of Computational Science, Systems Development, Numerical Methods, and Stochastic Optimization. Among the many computational and statistical electives are Bayesian Data Analysis, Statistical Computing and Learning, Statistical Machine Learning, Efficient Algorithms, Computational Learning Theory and Advanced Machine Learning. Students can also branch out to offerings from top Harvard departments in Economics, Government, Health Sciences and Organismic and Evolutionary Biology (yikes!).
On paper, I like this curriculum a lot as a foundation for data science. CSE graduates should be well-versed in programming, computation, numerical methods, optimization, statistics and applications to the social, health or natural sciences. A social science angle is particularly intriguing to me since business behavior does, after all, follow basic economic and behavioral principles.
Of course, a limited-enrollment advanced degree program from Harvard is not accessible to most. What’d be nice is if the CSE curriculum were administered online, providing opportunities to a larger group of geographically dispersed students, much like Northwestern’s new online predictive analytics program. Perhaps, too, Harvard’s cachet will entice other schools to follow its lead and offer similar programs. The Harvard pedigree won’t hurt a bit.