I must admit I'm obsessed with BI designs for business performance measurement. I guess I'm jaundiced, having seen too many naive, single-group, pretest-treatment-posttest analyses offered as “proof” of a new initiative's success, when the results can just as easily be explained by other factors. As I continued my search for answers to BI design, a Google of “research designs” landed me on the very informative web page of a graduate course at North Carolina State University taught by G. David Garson, entitled Quantitative Research for Public Administration.

For Garson, research designs are characterized as either experimental or quasi-experimental, depending on whether subjects are randomly assigned to treatment and control groups. The randomization deployed in experimental designs "goes a long way toward controlling for variables which are not included in the study", while quasi-experimental designs have to "control for confounding variables explicitly through statistical techniques." Garson proceeds to provide a comprehensible-for-BI taxonomy of experimental and quasi-experimental designs borrowing from the classic Quasi-Experimentation: Design and Analysis Issues for Field Settings, and its updated cousin Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Armed with the powerful designs for analytics presented in these sources, BI practitioners can make stronger cases for what's really happening when they measure and attribute business performance. The remainder of this column focuses on experimental designs pertinent for business; a subsequent article will look at quasi-experimental designs.

The simplest experimental designs for business are between subjects, wherein the BI analyst is comparing individuals who experience different interventions – e.g. credit card prospects who are randomly exposed to Internet promotions offering different levels of interest rate, annual fee and loyalty program factor options. A given prospect is exposed to only one combination of levels of these factors, and comparisons are then made across subjects. In a factorial design, each level of every factor is represented, with equal numbers of respondents in each of the factor/level categories.
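A minimal sketch of such a balanced factorial assignment, in Python. The factor names, levels, and prospect count are hypothetical, chosen only to mirror the credit card example; a real test would draw its factors and sample sizes from the campaign at hand.

```python
import itertools
import random

# Hypothetical factors and levels for a credit card promotion test
factors = {
    "interest_rate": ["9.9%", "14.9%"],
    "annual_fee": ["$0", "$50"],
    "loyalty_program": ["none", "points"],
}

# Full factorial: every combination of every factor's levels (2 x 2 x 2 = 8 cells)
cells = list(itertools.product(*factors.values()))

def assign(prospects, cells, seed=42):
    """Randomly assign an equal number of prospects to each factor/level cell."""
    assert len(prospects) % len(cells) == 0, "need equal cell sizes"
    rng = random.Random(seed)
    shuffled = prospects[:]
    rng.shuffle(shuffled)
    per_cell = len(shuffled) // len(cells)
    return {
        cell: shuffled[i * per_cell:(i + 1) * per_cell]
        for i, cell in enumerate(cells)
    }

groups = assign([f"prospect_{n}" for n in range(80)], cells)
```

Because each prospect lands in exactly one cell and every cell is the same size, comparisons across cells are between subjects, with randomization doing the work of controlling unmeasured variables.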

For within subject or repeated measure designs, on the other hand, subjects are measured multiple times for their factor/level categories, and serve as their own “controls”. An example of within subject for business would be a panel study by a marketing research firm where individuals are surveyed repeatedly for purchasing patterns and opinions over time. A close kin of within subject is the matched pair design, whereby the repeated measures are not of the same individuals, but rather different subjects matched on key attributes. Within and matched designs have different strengths and weaknesses important for BI analysts to consider.
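A matched pair assignment can be sketched in a few lines: sort subjects on the matching attribute, pair adjacent subjects, then randomize within each pair. The attribute (prior-year spend) and subject counts below are hypothetical illustrations, not a prescription.

```python
import random

# Hypothetical subjects with a matching attribute (e.g., prior-year spend)
subjects = [{"id": i, "spend": random.Random(i).randint(100, 999)} for i in range(20)]

def matched_pairs(subjects, key, seed=7):
    """Sort on the matching attribute, pair adjacent subjects,
    then randomly split each pair into treatment and control."""
    rng = random.Random(seed)
    ordered = sorted(subjects, key=key)
    treatment, control = [], []
    for a, b in zip(ordered[::2], ordered[1::2]):
        t, c = rng.sample([a, b], 2)
        treatment.append(t)
        control.append(c)
    return treatment, control

treatment, control = matched_pairs(subjects, key=lambda s: s["spend"])
```

Matching on a single attribute, as here, is the simplest case; in practice analysts often match on several key attributes at once, which is where the design's strengths and weaknesses start to matter.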

In addition to these “classical” statistical designs, there are ingenious testing approaches pertinent for business field settings that use randomization but also recognize the scarcity of available intervention slots. Where demand exceeds supply, a lottery is often a socially acceptable and methodologically sound design. Lotteries are used to assign applicants to magnet schools and can serve business as well for soliciting and allocating participants to attractive pilot programs where there are more candidates than positions. Relatedly, a waiting list design acknowledges that an intervention cannot be given to all interested parties. The waiting list, of course, serves as a control group for those fortunate enough to be accepted for treatment.
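The lottery and waiting list mechanics above amount to a single random draw. A minimal sketch, with hypothetical applicant names and slot counts:

```python
import random

def lottery(applicants, slots, seed=1):
    """Randomly draw winners for the limited slots; the remaining
    applicants form the waiting-list control group."""
    rng = random.Random(seed)
    winners = rng.sample(applicants, slots)
    waiting_list = [a for a in applicants if a not in winners]
    return winners, waiting_list

applicants = [f"applicant_{n}" for n in range(50)]
treated, control = lottery(applicants, slots=15)
```

Because all applicants wanted in, the randomized draw makes the waiting list a legitimate control rather than a self-selected comparison group.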

The equivalent group time series design can be deployed when the intervention must be temporarily rationed. A company's new training program, for example, cannot be delivered to all employees simultaneously, but only to different groups in sequence, thus allowing multiple curricula to be implemented and effectiveness contrasted. Businesses often use the spatial separation design to pilot new products or concepts in geographically dispersed areas where treatment-relevant communication cannot contaminate findings. And finally, two designs, mandated change/unknown solution and indifference curve, can be used when there is no clear program “winner” to contrast with control. With mandated change, many candidate treatments are compared with a do-nothing control. With indifference curve, the attractiveness of a program is adjusted so that participants are indifferent, and subsequently randomly assigned to treatment or control. These latter designs are particularly useful in “nudge” behavior investigations consistent with the libertarian paternalism of Nudge: Improving Decisions About Health, Wealth, and Happiness.
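The equivalent group time series idea, randomly splitting employees into waves that receive the training in sequence, can be sketched as follows. The wave count and employee labels are hypothetical; the point is that until a wave's turn arrives, it serves as a concurrent control for the waves already trained.

```python
import random

def staggered_rollout(employees, n_waves, seed=3):
    """Randomly split employees into equivalent waves; each wave
    receives the training in sequence, so untreated waves serve
    as concurrent controls for the treated ones."""
    rng = random.Random(seed)
    pool = employees[:]
    rng.shuffle(pool)
    return [pool[i::n_waves] for i in range(n_waves)]

waves = staggered_rollout([f"emp_{n}" for n in range(30)], n_waves=3)
# Wave k is treated from period k onward; in earlier periods it is a control.
```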