I was recently turned on to two interesting Wired opinion articles from one of my LinkedIn groups. I’ll share thoughts on the first today and the second a few weeks out.
I guess I shouldn’t be surprised that Samuel Arbesman, author of “Stop Hyping Big Data and Start Paying Attention to ‘Long Data’,” is a fellow at the Institute for Quantitative Social Science at Harvard. I’ve mentioned the IQSS as a breeding ground for data scientists a dozen times in my IM blogs over the years.
Arbesman’s thesis is that while big data is a powerful lens into the workings of human behavior, most such analyses are limited by being just a snapshot in time. What’s missing is the time dimension. “Many of the things that affect us today and will affect us tomorrow have changed slowly over time: sometimes over the course of a single lifetime, and sometimes over generations or even eons. Data sets of long timescales not only help us understand how the world is changing, but how we, as humans, are changing it — without this awareness, we fall victim to shifting baseline syndrome.”
The implication? Long data, or time series data, can often lift analytics arguments from inadequate to compelling.
Like Arbesman, top data scientists are concerned with the validity of their analyses, using guides such as “Experimental and Quasi-Experimental Designs for Generalized Causal Inference.” They obsess over both internal validity, which concerns whether study interventions actually cause the hypothesized effects, and external validity, which concerns generalizing findings across different settings and population samples.
Scientists consider four distinct elements of research designs that impact validity: 1) assignment, 2) measurement, 3) comparison groups, and 4) treatments. In the absence of true random “assignment” to experimental interventions, the other three elements become ever more critical for strengthening the analysis.
In the non-experimental or observational studies prevalent in business, time series designs can play an especially important role for data scientists. The combination of multiple points of measurement both before and after the intervention and a comparison group that’s not randomly assigned can help rule out sources of statistical bias.
Consider the simple one-group, pretest-posttest non-experimental design consisting of a pre-measurement (O11), an intervention (X), and a post-measurement (O21), with timeline:
1) O11 X O21
While the pre-measurement in this weak scheme does say something about the counterfactual (what might have happened without the intervention), the difference between O21 and O11 could easily be due to factors other than the intervention, such as history (outside events that occur between the measurements) or maturation (change that would have happened over time anyway).
If we add more pre and post measurements to 1), building the “interrupted time series” of 2), the potential biases of maturation and history, while still present, are at least somewhat mitigated, especially if O21-O13 is greater than differences such as O13-O12 or O22-O21. The additional observations enhance the validity of this design.
2) O11 O12 O13 X O21 O22 O23 O24
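To make the comparison in 2) concrete, here is a minimal Python sketch. The observations are invented purely for illustration, and the names (pre, post, successive_differences) are my own. It simply asks whether the jump across the intervention, O21-O13, is larger than any of the ordinary period-to-period differences.

```python
# Hypothetical observations for design 2); the numbers are made up.
pre = [10.0, 10.5, 11.0]           # O11, O12, O13
post = [15.0, 15.5, 16.0, 16.5]    # O21, O22, O23, O24


def successive_differences(series):
    """Differences between adjacent observations, e.g. O12-O11, O13-O12."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# The change across the intervention: O21 - O13.
jump = post[0] - pre[-1]

# Ordinary period-to-period changes on either side of the intervention.
background = successive_differences(pre) + successive_differences(post)
largest_background = max(abs(d) for d in background)

# A crude check: does the jump exceed every background difference?
jump_stands_out = abs(jump) > largest_background

print(f"jump across X: {jump}")
print(f"largest background difference: {largest_background}")
print(f"jump stands out: {jump_stands_out}")
```

This is an eyeball check rather than a statistical test. With only a handful of observations it can suggest an effect, not establish one.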
When there are scores of pre and post measurements surrounding the intervention, it makes sense to statistically evaluate changes in both the intercept (level) and slope of the time series at the point of intervention, with a change in either saying something about the effects of treatment.
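One simple way to look at intercept and slope, sketched here under my own assumptions, is to fit a separate least-squares line to the pre and post observations and compare the two at the intervention point. The function names (fit_line, interruption_effects) and the noise-free data are hypothetical. A real analysis would also need standard errors and some handling of autocorrelation, both of which this sketch leaves out.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def interruption_effects(times, values, t_intervention):
    """Fit separate pre and post lines; return (level_change, slope_change).

    level_change is the gap between the two fitted lines evaluated at the
    intervention time; slope_change is the post slope minus the pre slope.
    """
    pre = [(t, v) for t, v in zip(times, values) if t < t_intervention]
    post = [(t, v) for t, v in zip(times, values) if t >= t_intervention]
    pre_b0, pre_b1 = fit_line(*zip(*pre))
    post_b0, post_b1 = fit_line(*zip(*post))
    level_change = (post_b0 + post_b1 * t_intervention) - (
        pre_b0 + pre_b1 * t_intervention
    )
    return level_change, post_b1 - pre_b1


# Made-up, noise-free series: the level rises by 3 and the slope by 0.5 at t = 10.
times = list(range(20))
values = [5 + 0.2 * t if t < 10 else 3 + 0.7 * t for t in times]

level_change, slope_change = interruption_effects(times, values, 10)
print(f"level change: {level_change:.2f}, slope change: {slope_change:.2f}")
```

Fitting the two segments separately keeps the arithmetic transparent. A single regression with an intervention indicator and an interaction term is another common way to get at the same two quantities.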
An even more sophisticated design adds a “non-randomized”, no-treatment comparison group to the interrupted time series in 2):
3) O11 O12 O13 X O21 O22 O23 O24
   O11 O12 O13   O21 O22 O23 O24
Design 3) adds further protection from history as a threat to study validity: both groups are exposed to the same outside events, so treatment and control should behave similarly in the absence of an intervention effect.
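A rough sketch of how the comparison group in 3) gets used: subtract the comparison group's pre-to-post change from the treated group's change, so that whatever moved both groups cancels out. The data and variable names below are invented for illustration, and the calculation assumes the two groups would have moved in parallel without the intervention.

```python
from statistics import mean

# Hypothetical observations for design 3); the numbers are made up.
treated_pre = [20.0, 21.0, 22.0]            # O11 O12 O13, treated group
treated_post = [30.0, 31.0, 32.0, 33.0]     # O21 O22 O23 O24, treated group
control_pre = [18.0, 19.0, 20.0]            # O11 O12 O13, comparison group
control_post = [22.0, 23.0, 24.0, 25.0]     # O21 O22 O23 O24, comparison group

# Pre-to-post change in each group's average outcome.
treated_change = mean(treated_post) - mean(treated_pre)
control_change = mean(control_post) - mean(control_pre)

# The comparison group's change stands in for what history alone would have
# done; the remainder is the estimated intervention effect.
estimated_effect = treated_change - control_change

print(f"treated change: {treated_change}")
print(f"comparison change: {control_change}")
print(f"estimated effect: {estimated_effect}")
```

Because the comparison group is not randomly assigned, the estimate is only as good as the assumption that the groups are comparable in the first place.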
A final design to consider with time series data is called multiple replications, in which the intervention is introduced, removed, re-introduced, removed, etc. Such a scheme might look like the following:
4) O11 O12 X O21 O22 !X O31 O32 X O41 O42
where X is the introduction of treatment and !X is its removal. While this design might be difficult to implement, it can be powerful. “A treatment effect is suggested if the dependent variable responds similarly each time the treatment is introduced and removed, with the direction of responses being different for the introductions compared to the removals.”
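The pattern described in that quote can be checked mechanically. The sketch below is my own illustration with invented data and hypothetical function names (phase_shifts, suggests_treatment_effect). It measures the shift in the average outcome at each introduction and removal, then asks whether introductions all move one way and removals all move the other.

```python
from statistics import mean


def phase_shifts(phases):
    """Return (label, shift) for each phase after the first, where shift is
    the change in mean outcome relative to the preceding phase."""
    means = [mean(observations) for _, observations in phases]
    return [
        (phases[i][0], means[i] - means[i - 1]) for i in range(1, len(phases))
    ]


def suggests_treatment_effect(phases):
    """True if every introduction ("X") shifts the outcome in one direction
    and every removal ("!X") shifts it in the opposite direction."""
    shifts = phase_shifts(phases)
    intros = [s for label, s in shifts if label == "X"]
    removals = [s for label, s in shifts if label == "!X"]
    if not intros or not removals:
        return False
    up_then_down = all(s > 0 for s in intros) and all(s < 0 for s in removals)
    down_then_up = all(s < 0 for s in intros) and all(s > 0 for s in removals)
    return up_then_down or down_then_up


# Hypothetical observations for design 4); the numbers are made up.
phases = [
    ("baseline", [10.0, 11.0]),   # O11 O12
    ("X", [14.0, 15.0]),          # O21 O22, treatment introduced
    ("!X", [11.0, 10.0]),         # O31 O32, treatment removed
    ("X", [15.0, 14.0]),          # O41 O42, treatment re-introduced
]

print(phase_shifts(phases))
print(suggests_treatment_effect(phases))
```

The check looks only at direction, not size, so it says nothing about whether the shifts are large enough to matter.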
There’s no shortage of interrupted time series and other quasi-experimental designs to help assure the validity of analysis when randomized experiments aren’t feasible. Data scientists are well advised to use such methods enthusiastically to support their investigations.
Those unconvinced of the need to consider time in the analysis of data should check out this provocative visualization.