More Time Series Interruptus
I received an email from a reader after my last blog on interrupted time series asking for additional references on ITS techniques. The Experimental and Quasi-Experimental Designs for Generalized Causal Inference book noted in that article is excellent, as is the timeless Campbell and Stanley classic, Experimental and Quasi-Experimental Designs for Research, a staple of university social science methodology courses for generations.
Indeed there is much in common between social science methods and BI supporting the “science of business”. Both endeavors strive to “prove” the cause and effect of planned interventions using randomized experiments that, subject to probability limitations, can reasonably ensure that differences in treated and untreated groups are the result of the intervention and not confounding outside variables. Unfortunately, business, like the social sciences, is often not the ideal environment for true experimental methods, and in many cases must rely on less intrusive non-randomized or quasi-experimental designs. Humans and business are, alas, not crops and soil.
Educational psychologists are obsessed with designs and statistics, much like BI analysts. In fact, our current understanding of quasi-experimental designs comes substantially from their efforts to evaluate the effectiveness of educational interventions such as innovative teaching methods. As I searched for additional references to answer the reader's inquiry, I came across a good one in my stash of accumulated documents. And, as luck would have it, the paper's available for public download. Interrupted Time Series Quasi-Experiments, published in the mid-1990s by Arizona State professor Gene Glass, is a comprehensive but accessible introduction to ITS that can serve as a reference for interested BI analysts.
According to Glass, “True experiments satisfy three conditions: the experimenter sets up two or more conditions whose effects are to be evaluated subsequently; persons or groups of persons are then assigned strictly at random, that is, by chance, to the conditions; the eventual differences between the conditions on the measure of effect (for example, the pupils' achievement) are compared with differences of chance or random magnitude.”
In contrast, with natural teaching settings and business environments, “You might often find that a time-series experiment is a workable alternative when the conditions of a true experiment can not be met...But the time-series experiment imposes its own requirements, the foremost of which is that it sometimes requires that data be recorded for many consecutive points in time before and after a treatment is introduced...The simple logic of the time-series experiment is this: if the graph of the dependent variable shows an abrupt shift in level or direction precisely at the point of intervention, then the intervention is a cause of the effect on the dependent variable.”
Glass proposes a wealth of ITS designs that can be deployed in BI settings, starting with the simple: O O O O O O X O O O O O O O, where O represents an observation or measurement, and X the intervention or treatment. He follows with the treatment/control ITS, the multiple intervention ITS, a sequential multiple group ITS, and multiple group reversal. Each of the designs addresses different threats to the validity of interpretation. And, unlike true experiments that involve randomization, “natural” ITS designs are often easy to deploy – even after the fact.
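For analysts who want to try the simple O O O X O O O design on warehouse data, here is a minimal sketch of one common way to look for a shift in level or direction at the intervention point: a segmented regression fit by ordinary least squares. This is my own illustration, not a method taken from Glass's paper. The function name fit_segmented, the coefficient labels and the synthetic series are all hypothetical, and plain least squares ignores the autocorrelation that real time series usually carry, so treat the output as a first look rather than a test.

```python
import numpy as np

def fit_segmented(y, intervention_index):
    """Fit y_t = b0 + b1*t + b2*post_t + b3*(t - t0)*post_t by least squares.

    b2 estimates the change in level at the intervention (t0) and
    b3 the change in slope (direction) afterwards.
    Hypothetical helper for illustration; it ignores autocorrelation.
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    post = (t >= intervention_index).astype(float)   # 0 before X, 1 from X on
    time_since = (t - intervention_index) * post      # periods elapsed since X
    X = np.column_stack([np.ones_like(t), t, post, time_since])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    names = ["baseline_level", "baseline_slope", "level_change", "slope_change"]
    return {name: float(value) for name, value in zip(names, coef)}

# Synthetic, made-up series: 25 pre and 25 post observations,
# an upward trend, and a drop in level of 8 units at the intervention.
rng = np.random.default_rng(0)
n_pre, n_post = 25, 25
t = np.arange(n_pre + n_post)
y = 100 + 0.5 * t - 8.0 * (t >= n_pre) + rng.normal(0, 1.0, size=t.size)

estimates = fit_segmented(y, intervention_index=n_pre)
print(estimates)
```

A level_change well away from zero points to an abrupt shift in level, while slope_change captures a change in direction. The treatment/control variant would fit the same model to a comparison series and contrast the two sets of estimates.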
The author also provides a taxonomy of intervention effects that analysts should look for when analyzing a time series “experiment”. Of course, changes in measurement level following the intervention are first in priority. The author identifies four different types – abrupt, delayed, temporary and decaying – each of which provides different information on the treatment. In addition to levels, there are also changes in the direction of measurement, including abrupt, delayed, temporary and accelerated. Finally, there might be changes in the volatility of the series post-treatment, an example being the roller coaster rides of major stock indexes during the financial crisis.
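To make the four types of level change concrete, the sketch below generates an idealized effect curve for each one. The shapes reflect my reading of the labels rather than definitions quoted from Glass, and the function level_effect along with its parameters (size, delay, duration, decay) is hypothetical.

```python
import numpy as np

def level_effect(kind, n_pre, n_post, size=1.0, delay=3, duration=4, decay=0.7):
    """Return an idealized level-change pattern around an intervention.

    Illustrative shapes only (an assumption, not Glass's definitions):
      abrupt    - full effect from the intervention onward
      delayed   - full effect starting `delay` periods later
      temporary - full effect for `duration` periods, then back to baseline
      decaying  - full effect at first, shrinking by `decay` each period
    """
    k = np.arange(n_pre + n_post) - n_pre   # periods since the intervention
    post = k >= 0
    effect = np.zeros(k.size)
    if kind == "abrupt":
        effect[post] = size
    elif kind == "delayed":
        effect[k >= delay] = size
    elif kind == "temporary":
        effect[post & (k < duration)] = size
    elif kind == "decaying":
        effect[post] = size * decay ** k[post]
    else:
        raise ValueError(f"unknown effect type: {kind}")
    return effect

for kind in ("abrupt", "delayed", "temporary", "decaying"):
    print(kind, np.round(level_effect(kind, n_pre=3, n_post=6), 2))
```

Committing to one of these shapes before looking at the post-intervention data is a practical way to state the hypothesis in advance.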
How many observations are needed for a valid interpretation of ITS findings? According to Glass: “Few statisticians would insist on more than 100 points in time, but some of the more rigid ones might not accept fewer. Fifty time points (25 pre and 25 post) is a good round number, provided you make a definite hypothesis about the form of the intervention effect and stick with it; that is, provided you don't succumb to the temptation of fishing around in the data after you see it and taking second and third guesses at the form of the intervention effect.”
Glass demonstrates his ITS investigative prowess with an analysis of British automobile fatalities surrounding enactment of the British Road Safety Act of 1967. When examined as an aggregate time series, the fatality rate, seasonally adjusted and corrected for miles driven, appears to be trending down with no discernible changes in level or direction post-BRSA enactment. The author digs deeper into the data, however, examining a “sub-series” over late evenings and early mornings Saturday and Sunday, to evaluate the effectiveness of the BRSA's program of roadblocks for testing driver blood alcohol levels. The findings from this limited time frame were provocative, demonstrating a “more than 50%, reduction in fatalities...We can move from the equivocal results to the clear certainty, merely by separating the larger body of data.”
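A little arithmetic shows why the aggregate series can hide what the sub-series reveals. The sketch below uses made-up counts, not the British data: a hypothetical weekend-night series that falls by half after the intervention, combined with a much larger series that does not change at all.

```python
import numpy as np

n_pre, n_post = 12, 12
post = np.arange(n_pre + n_post) >= n_pre

# Hypothetical counts per period, invented for illustration only.
weekend_night = np.where(post, 50.0, 100.0)    # halves after the intervention
all_other = np.full(n_pre + n_post, 900.0)     # unaffected by the intervention
aggregate = weekend_night + all_other

def pct_change(series, post):
    """Percent change in the mean from the pre period to the post period."""
    before, after = series[~post].mean(), series[post].mean()
    return 100.0 * (after - before) / before

sub_change = pct_change(weekend_night, post)   # -50.0
agg_change = pct_change(aggregate, post)       # -5.0
print(sub_change, agg_change)
```

With these invented numbers, a 50% drop in the sub-series shows up as only a 5% drop in the aggregate, which ordinary noise or an existing downward trend could easily mask.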
It's not a stretch to propose that evaluations like this, built on simple ITS designs with the data warehouse as the time series source, can be an invaluable adjunct to the “science of business”.
Steve also blogs at Miller.OpenBI.com.