Two and a half years ago I wrote an article for the then DMReview on OpenCourseWare at MIT and its value for business intelligence. In a nutshell, the OpenCourseWare initiative makes available to the public materials such as lecture notes, reading lists, handouts, problem sets and tests from classes given at MIT and other top universities, creating a treasure chest of learning opportunities from some of the top academics in the world – for free.
I revisited MIT OpenCourseWare a few months ago in search of a gentle computer programming course I could recommend to those interested in BI without a strong technology background. Not only did I find such a class, but I also saw how much more sophisticated the available OCW materials have become. Many newer courses now include audio/video of class lectures and recitations, so participants can see/hear the instructors “live”. With this technology, the public can enjoy much of the experience of students paying $55,000/year. Good deal.
6.00 Introduction to Computer Science and Programming, taught by two eminent MIT Electrical Engineering and Computer Science (EECS) professors, uses Python to introduce programming concepts to students with modest technology backgrounds. An “agile” language with a simple syntax and high-level constructs, Python is much easier for beginners than C++ or Java, the usual suspects for Introduction to Programming. Having worked with Perl, Python and now Ruby extensively, I can testify to the productivity of these tools. Agile languages can solve many problems with a fraction of the code needed for Java/C++. There's also a wealth of freely available, pre-programmed packages to extend their capabilities. For most day-to-day programming tasks, and even many sophisticated web applications, these languages are now ideal.
After the first few introductory lectures, I'm not sure this course remains so gentle, though. The instructors take advantage of Python's short syntax learning curve to expand the range of problems they tackle. Students are exposed to data structures, algorithms and object-orientation, and are tasked with writing moderately complicated programs to demonstrate their understanding. The final third of the one-quarter schedule revolves around Python-programmed simulation, Monte Carlo experiments, statistical distributions and applied regression analysis, climaxing with a stock market simulation example. The instructors' focus on statistical computation, as well as their emphasis on the use of already-developed packages, is of special benefit to new BI programmers.
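To give a flavor of what a Monte Carlo stock simulation looks like in Python, here is a minimal sketch. It is not taken from the course materials: the function names, the normally distributed daily returns and the parameter values are all assumptions chosen purely for illustration.

```python
import random
import statistics


def simulate_final_price(start_price, days, mean_return, volatility, rng):
    """Walk one hypothetical price path forward a day at a time.

    Assumption: each daily return is drawn from a normal distribution.
    """
    price = start_price
    for _ in range(days):
        price *= 1.0 + rng.gauss(mean_return, volatility)
    return price


def monte_carlo(trials, seed=0, **params):
    """Repeat the simulation many times and collect the final prices."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    return [simulate_final_price(rng=rng, **params) for _ in range(trials)]


# Illustrative parameters only, not calibrated to any real market.
finals = monte_carlo(
    trials=2000,
    start_price=100.0,
    days=250,
    mean_return=0.0003,
    volatility=0.01,
)

print(f"mean final price:         {statistics.mean(finals):.2f}")
print(f"std dev of final price:   {statistics.stdev(finals):.2f}")
print(f"share ending below start: {sum(p < 100.0 for p in finals) / len(finals):.1%}")
```

The point of the exercise is the distribution, not any single path: running thousands of trials turns one uncertain outcome into a set of summary statistics an analyst can reason about.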
Before leaving EECS, check out a few other gems. Courses denoted IAP are short Independent Activities Period curricula, with topics such as Java and C++ programming. 6.099 Street-Fighting Mathematics “teaches the art of guessing results and solving problems without doing a proof or an exact calculation. Techniques include extreme-cases reasoning, dimensional analysis, successive approximation, discretization, generalization, and pictorial analysis.” Seems quite pertinent for decision-making to me. 6.042J Mathematics for Computer Science, the study of logic, simple probability and discrete mathematics, provides a terrific background for BI. Before departing MIT, review the offerings from the Sloan School of Management which, as readily guessed, has a strong quantitative orientation to business.
From MIT, get on a virtual plane for the short flight to Stanford University in northern California. Keep your seat belts fastened though. The journey to Stanford Engineering Everywhere (SEE), the university's equivalent of OCW, is quite the BI open educational ride. If you're looking for a comprehensive year-long introduction to object-oriented programming, data structures, algorithms and paradigms in Java, C++ and other languages, you've found the place.
I was even more intrigued by the Artificial Intelligence/Machine Learning course offered by celebrated professor Andrew Ng. ML is tough stuff and not for slackers, but for those willing to make a significant work investment, this class could provide a transformative educational experience. The prerequisites don't seem to be set in stone, though I'm not sure how a student could survive without a strong background in math, probability and statistics, and programming. The agenda is very ambitious, covering a multitude of statistical and machine learning techniques, including supervised, unsupervised and reinforcement learning models.
The what of ML includes a dizzying array of different machine learning models useful for BI. The course covers multiple and logistic regression, general linear models, discriminant analysis, multinomial models, smoothers, ridge and Softmax regressions, non-linear classifiers, neural networks, support vector machines, Bayesian models, k-means clustering, text clustering, principal components, independent components, and factor analysis -- among others! The models themselves are just a starter. Students demonstrate command of model mathematics, optimization methods and validation/tuning/diagnostics in challenging homework assignments using the mathematical language MATLAB. These exercises showcase the how of machine learning. Finally, students deploy techniques learned in the class for a paper that applies the methods to a problem area in their own research -- demonstrating the why of ML. Science and engineering students often use what they learn in ML to analyze data for their theses and dissertations. The same techniques are, of course, ideal for business applications.
All the pieces – lectures, notes, handouts, review documents, assignments and tests – are in place for enterprising “open” BI candidates to learn the same ML material as advanced Stanford students, without leaving their notebook computers. Even if interested analysts cannot commit to all the work, “auditing” ML to learn the what and the why by listening to Professor Ng's lectures alone is still of value. By the way, if the ML materials aren't enough to jump-start your MATLAB learning for the challenging class assignments, take a quick virtual trip back to MIT!