I recently attended The Data Warehousing Institute (TDWI) spring conference in Boston. It was a great trip. Oh, the meeting was fine - I learned some new tricks and met some old friends. An unexpected highlight for me, however, was the fantastic weather and a city alive with spring - three straight days of 75-80 degrees and bright sun in Boston in the middle of May. With the beautiful weather and some time on my hands, I was able to take advantage of Boston's well-deserved reputation as a great walking city. Several times, I made the trip up Massachusetts Avenue across the Harvard Bridge to Cambridge - to the heart of the MIT campus. With joggers, bicyclists and students everywhere, I almost (but not quite) wished I were back in school.
MIT, of course, is among the leading universities in the world, a nonpareil science, engineering and research institution. Flush with excitement from spring and a vibrant college community, I used some of my extra time to browse the MIT Web site (http://mit.edu/), intent on becoming acquainted with some of MIT's latest scientific and technological research. Instead, I discovered OpenCourseWare from the MIT home page and have been figuratively back in school - without the pressure of grades - ever since.
OpenCourseWare (OCW) is described on the MIT Web site (http://ocw.mit.edu/index.html) as "a free and open educational resource for educators, students and self-learners around the world." Long a noteworthy open source leader, innovator and contributor, MIT has raised the open stakes by publishing MIT course materials, making them generally available to the public. And though other U.S. universities such as Harvard, Johns Hopkins and Notre Dame belong to the OCW Consortium, none has made the comprehensive commitment of MIT.
At its core, MIT OCW is an electronic publishing model for educational materials enabled by Internet technologies. The idea behind MIT OCW is to make the course materials used in most undergraduate and graduate subjects taught at MIT available on the Web, free of charge, to any user, anywhere. There is no registration, no degrees or certificates and no access to MIT faculty. At this point there are more than 1,550 courses available through MIT OCW, spanning all university departments.
MIT OCW courses are organized by department and by undergraduate/graduate status. Materials available for specific courses vary, but typically include syllabi, reading assignments, written assignments, study materials and exams. The more comprehensive include lecture notes, generally in the form of slide decks organized in PDFs. For stats/analytics courses, data sets are often available. Components may be downloaded individually, while the entire "course" is available as a zip file.
OpenCourseWare for BI
Many of the courses highlighted in the MIT OCW offer significant value for BI analysts. Not surprisingly, courses offered under the Sloan School of Management are particularly relevant. A few are noted here, but many more pertaining to technology, strategy, systems and process optimization are available to support disparate BI uses. My experience is that those with lecture notes offer the most immediate value. What follows is a small sampling of courses that roused my interest.
Communicating with Data (http://ocw.mit.edu/OcwWeb/Sloan-School-of-Management/15-063Communicating-With-DataSummer2003/CourseHome/index.htm) is a conceptual graduate-level survey concerned with "quantitative techniques as a way of thinking, not just a way of calculating, in order to enhance decision-making skills." (See Figure 1.) The point of departure for this course is decision analysis, which uses decision trees surrounding GOOP (goals, options, outcomes, probabilities) as a framework for structuring decisions. Unlike a pure stats course, Communicating with Data touches many quantitative and behavioral decision disciplines including probability, portfolio and risk management, utility theory, simulation and Monte Carlo analysis, and individual value - and thus serves both as a broad introduction to BI and analytic thinking and as a foundation for more advanced analytics work.
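To make the GOOP framing concrete, here is a minimal Python sketch of how a one-level decision tree might be evaluated by expected value. It is my own illustration, not material from the course: the goal (maximize expected payoff), the two options and every probability and payoff are hypothetical.

```python
# Hypothetical GOOP example: the goal is to maximize expected payoff.
# Each option maps to its possible outcomes as (probability, payoff) pairs.
options = {
    "launch_now": [(0.6, 500_000), (0.4, -300_000)],
    "pilot_first": [(0.8, 250_000), (0.2, -50_000)],
}


def expected_value(outcomes):
    """Return the probability-weighted average payoff for one option."""
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * payoff for p, payoff in outcomes)


def best_option(options):
    """Return (name, expected value) for the option with the highest expected value."""
    values = {name: expected_value(outcomes) for name, outcomes in options.items()}
    best = max(values, key=values.get)
    return best, values[best]


if __name__ == "__main__":
    name, value = best_option(options)
    print(f"Best option: {name} (expected value {value:,.0f})")
```

With these made-up numbers the pilot edges out the immediate launch, even though the launch has the larger upside. A fuller analysis would also weigh risk attitude, which is where the course's utility theory material comes in.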
Applied Statistics (http://ocw.mit.edu/OcwWeb/Sloan-School-of-Management/15-075Applied-StatisticsSpring2003/CourseHome/index.htm) is an elementary/intermediate statistics course for undergraduates (see Figure 2). The lecture notes for this course are nothing short of fabulous. BI analysts can learn much from just the first two lectures on collecting data and summarizing/exploring data. Indeed, the summarizing/exploring data notes should be required reading for aspiring BI practitioners. This lecture pays homage to John Tukey and the Exploratory Data Analysis (EDA) movement in statistics that emphasizes data distribution and visualization in contrast to "the mathematization of statistics." Basic summarization graphs, such as stem and leaf, box and whiskers, scatter plot matrices and quantile plots, are highlighted as simple but invaluable aids to understanding data. All statistical/graphical illustrations as well as homework assignments are completed using the R/S-Plus family of statistical packages.
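The course itself works in R/S-Plus, but the flavor of these EDA summaries can be sketched in a few lines of Python. The sample values below are hypothetical, the stem and leaf display assumes non-negative integers, and the five-number summary uses Tukey's hinges (the lower and upper halves each include the median when the count is odd), which is only one of several quartile conventions.

```python
from collections import defaultdict
from statistics import median

# Hypothetical sample, e.g. daily order counts.
sample = [12, 15, 21, 23, 23, 28, 31, 34, 37, 45, 58]


def stem_and_leaf(values):
    """Map each tens digit (stem) to the sorted list of ones digits (leaves)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(sorted(stems.items()))


def five_number_summary(values):
    """Return (min, lower hinge, median, upper hinge, max) using Tukey's hinges."""
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2  # each half includes the median when n is odd
    return (data[0], median(data[:half]), median(data),
            median(data[n - half:]), data[-1])


if __name__ == "__main__":
    for stem, leaves in stem_and_leaf(sample).items():
        print(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")
    print(five_number_summary(sample))
```

The five numbers are exactly what a box and whiskers plot draws, so the two summaries together give a quick read on center, spread and skew before any modeling begins.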
Data Mining (http://ocw.mit.edu/OcwWeb/Sloan-School-of-Management/15-062Data-MiningSpring2003/CourseHome/index.htm) is a personal favorite. The lecture notes, assignments and data sets for this undergraduate/graduate course are all quite valuable, even for those who opt not to use the Excel add-in recommended as the mining tool of choice for assignments. The focus is on applications of data mining technology to solving business problems rather than underlying mathematical and computer science theory. With data mining defined as "statistics at scale, speed and simplicity," the lecture notes cover all approaches, including statistics, machine learning, database retrieval and hybrid. I particularly like the discussions of classification trees, regression trees, logistic regression, principal components, discriminant analysis, neural nets and k-means clustering - all techniques that I've used in practice. The business applications discussed with the methods bring the techniques to life. The lecture on association rules (market basket analysis) references the work of Jiawei Han at the University of Illinois. (The OpenBI Forum will present a series of columns on data mining with Professor Han later in the summer.) Finally, the assignments and accompanying data sets are very valuable for investigating different mining approaches. I was readily able to access the data sets and complete the assignments with R and S-Plus. My one quibble with this course is the absence of notes for the last two lectures on collaborative filtering and the practice of data mining, which I'm sure are quite pertinent for practitioners.
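For readers new to market basket analysis, the following Python sketch computes the three standard measures for a single association rule: support, confidence and lift. It is an illustration under my own assumptions, not code from the course or from Professor Han's work, and the five baskets are invented.

```python
# Hypothetical transactions; each basket is the set of items in one purchase.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def count(itemset, baskets):
    """Number of baskets containing every item in itemset."""
    itemset = set(itemset)
    return sum(1 for basket in baskets if itemset <= basket)


def support(itemset, baskets):
    """Fraction of all baskets that contain the itemset."""
    return count(itemset, baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the fraction also containing the consequent."""
    antecedent_count = count(antecedent, baskets)
    if antecedent_count == 0:
        raise ValueError("antecedent never occurs in the baskets")
    return count(set(antecedent) | set(consequent), baskets) / antecedent_count


def lift(antecedent, consequent, baskets):
    """Confidence relative to the consequent's baseline support; above 1 suggests a positive association."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)


if __name__ == "__main__":
    rule = ({"diapers"}, {"beer"})
    print("support:", support(rule[0] | rule[1], baskets))
    print("confidence:", confidence(*rule, baskets))
    print("lift:", lift(*rule, baskets))
```

Real mining tools search over many candidate rules and prune by minimum support; this sketch only scores one rule so the definitions stay visible.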
The MIT social science departments such as political science, economics and urban studies also offer value for BI.
Quantitative Research Methods: Multivariate (http://ocw.mit.edu/OcwWeb/Political-Science/17-874Spring2004/CourseHome/index.htm) is a graduate-level course that focuses on statistics and design surrounding political research. Though much of the theoretical material on linear/nonlinear models and regression is not unique to political science, the applications of the techniques use examples from voting behavior and legislative representative studies to illustrate the approaches. Of course, the same techniques developed in this course are pertinent to the prediction and evaluation of business performance and are thus quite germane for BI analysts. The lecture on panel models hits with substantial rigor on several themes noted in the last two columns of the OpenBI Forum. Indeed, as we surmised in "Validity, Design and BI, Part 3," panel designs are near the top of the evidence hierarchy, able to withstand methodological challenges of selection, history and regression to the mean. The business community is well advised to borrow liberally from such methods to address similar challenges for BI.
Putting Social Sciences to the Test: Field Experiments in Economics (http://ocw.mit.edu/OcwWeb/Economics/14-11Spring-2006/CourseHome/index.htm) is an undergraduate course focused on the up-and-coming field of experimental economics. When I was in school many years ago, economists theorized on how we ought to behave as rational actors; now, at last, they are starting to pay attention to how people actually behave and ways in which incentives/disincentives can be used to change behaviors. Each lecture focuses on a different social issue such as public health, education, housing and voting - and the use of incentives in randomized treatments to determine effects on behavior. These experiments are generally a step up in the evidence hierarchy from observational studies, able to account for the methodological confounding noted above. Increasingly, businesses in the Internet age look to "prove" their strategies through similar rigorous field experiments that can borrow heavily from experimental economics. The lectures offer sound insights on how to devise experiments with appropriate incentive treatments.
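As a rough illustration of how a randomized field experiment might be analyzed, the Python sketch below estimates a treatment effect as a difference in group means and checks it with an exact permutation test. This is my own minimal example, not the course's method, and the outcome values for the incentive and control groups are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical outcomes (e.g. weekly purchases) for randomly assigned customers.
treated = [12, 15, 14, 18, 16, 17]   # received the incentive
control = [10, 11, 13, 12, 9, 14]    # did not


def difference_in_means(group_a, group_b):
    """Estimated effect: mean outcome of group_a minus mean outcome of group_b."""
    return mean(group_a) - mean(group_b)


def exact_permutation_p_value(treated, control):
    """Two-sided p-value: share of all possible reassignments of the pooled
    outcomes whose absolute mean difference is at least the observed one."""
    observed = abs(difference_in_means(treated, control))
    pooled = list(treated) + list(control)
    indices = range(len(pooled))
    extreme = total = 0
    for chosen in combinations(indices, len(treated)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(difference_in_means(group_a, group_b)) >= observed:
            extreme += 1
    return extreme / total


if __name__ == "__main__":
    print("estimated effect:", difference_in_means(treated, control))
    print("p-value:", exact_permutation_p_value(treated, control))
```

Because assignment is random, the reshuffled groupings show what differences chance alone would produce, which is what lets an experiment rule out the selection problems that dog observational comparisons. Exhaustive enumeration is only practical for small groups; larger studies would sample reassignments at random instead.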
A Workshop on Geographic Information Systems (http://ocw.mit.edu/OcwWeb/Urban-Studies-and-Planning/11-520Fall-2005/CourseHome/index.htm) is a graduate-level urban studies and planning course that details the workings of GIS. The focus is comprehensive, covering the technical underpinnings of GIS and GIS software as well as data gathering, analysis and presentation, and the evaluation of prospective GIS applications. Lectures cover GIS principles and methods, relational database management and data models, geospatial databases, geocoding and network analysis methods, and Internet GIS/ArcIMS. At the conclusion of what is certainly an exhaustive learning experience, students should be well versed in the workings of GIS for intelligence. A special discussion group established for this course offers educators, students and self-learners the opportunity to interact with others over class materials.
As noted by MIT President Susan Hockfield: "OpenCourseWare expresses in an immediate and far-reaching way MIT's goal of advancing education around the world. Through MIT OCW, educators and students everywhere can benefit from the academic activities of our faculty and join a global learning community in which knowledge and ideas are shared openly and freely for the benefit of all."
I've been using OCW for about a month and am a big fan. Having open access to the teaching wisdom of some of the finest minds in the world is certainly a boon for those committed to lifelong learning. And having open access to the premier academic foundations of business intelligence, methodology and performance is a potential competitive advantage that should not be overlooked by the BI community.