Much as I love the behavioral economics gospel espoused in Dan Ariely's delightful book, Predictably Irrational, I'm less than comfortable with the driving methodology of controlled classroom or laboratory experiments that use MIT business graduate students as subjects. I understand why Ariely uses lab experiments; in his own words: “They help us slow human behavior to a frame-by-frame narration of events, isolate individual forces, and examine those forces carefully and in more detail. They let us test directly and unambiguously what makes us tick.” I'm on board with the experiment part; it's the lab that causes me heartburn.

Indeed, I'm as big a fan as anyone of randomized experiments for business. Where appropriate, experiments are often the best way to isolate and manipulate factors of interest, with reasonable assurance from probability theory that potentially confounding outside variables are “equal” between experimental and control groups – and will subsequently not “queer” findings. Estimates derived from randomized experiments should therefore be unbiased, ultimately homing in on “correct” population values.
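The balancing effect of randomization is easy to see in a quick simulation. The sketch below is purely illustrative – the “prior spend” confounder and the numbers are invented – but it shows how a simple random split leaves a lurking variable roughly equal between treatment and control groups, which is the assurance probability theory provides.

```python
import random

random.seed(42)

# Hypothetical population: each "customer" carries a confounding trait
# (say, prior spend) drawn from the same distribution.
population = [random.gauss(100, 15) for _ in range(10_000)]

# Random assignment: shuffle, then split into treatment and control.
random.shuffle(population)
treatment, control = population[:5_000], population[5_000:]

def mean(xs):
    return sum(xs) / len(xs)

# With random assignment the confounder is balanced in expectation,
# so the gap between group means should be small relative to its scale.
gap = abs(mean(treatment) - mean(control))
print(f"confounder gap between groups: {gap:.2f}")
```

Any difference in outcomes between the two groups can then be attributed to the treatment rather than to prior spend, since neither group systematically got the bigger spenders.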

What I'm less enamored with are the artificial laboratory settings that drive basic behavioral research – and the relevance of those settings to applied business. Will irrational behaviors manifested in contrived classroom settings with highly intelligent 30-year-old MBA students generalize to complex business environments and the population at large? I'm unconvinced. A compromise that's particularly appealing, however, is gaining traction in both the business and behavioral science worlds. Field experiments combine the validity protection of randomization with the realism of natural business and behavioral environments.

For economists David Reiley and John List, field experiments occupy an important middle ground between laboratory experiments and naturally occurring or observational data. The idea behind field experiments is to make use of randomization in an environment that captures important characteristics of the real world. Field experiments permit the researcher to create exogenous (researcher-controlled) variation in the variables of interest, thus allowing her to establish causality rather than mere correlation. In contrast to laboratory experiments, field experiments relinquish some control in exchange for enhanced realism.
As an illustration, Reiley and List cite a retail pricing inquiry in which catalogs promoting a new cotton dress were sent to a sample of prospects, specifically to test the effectiveness of a price ending in 9 – the so-called nine-dollar price. The mailings offered the same dress at randomly selected prices of $34, $39, and $44. Investigators found “a positive effect of a nine-dollar price on quantity demanded, large enough that a price of 39 dollars actually produced higher quantities than a price of 34 dollars.” Ah, predictable irrationality in the real world that serves business well.
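The mechanics of such a catalog test are straightforward to sketch. The snippet below is a hypothetical re-creation, not the actual study: the purchase probabilities are invented for illustration (loosely echoing the reported result that $39 outsold $34), and the function name is mine.

```python
import random

random.seed(7)

PRICES = [34, 39, 44]
# Assumed response rates per price cell -- illustrative only,
# not figures from the Reiley/List example.
BUY_PROB = {34: 0.030, 39: 0.040, 44: 0.020}

def run_mailing(n_prospects):
    """Randomly assign one price per prospect; count purchases per cell."""
    purchases = {p: 0 for p in PRICES}
    for _ in range(n_prospects):
        price = random.choice(PRICES)          # randomized treatment
        if random.random() < BUY_PROB[price]:  # simulated buying decision
            purchases[price] += 1
    return purchases

results = run_mailing(30_000)
print(results)
```

Because each prospect's price is assigned at random, any difference in quantities sold across the three cells can be read causally – exactly the leverage the field experiment buys.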
Arguing that field experimentation is under-provisioned, Donald Green and Alan Gerber take a jaundiced view of the state of research methods in political science: “Each new development in data analysis, sampling theory, and computing seemed to make non-experimental research more promising and experimentation less so.” Green and Gerber essentially accuse the political science mainstream of opting to be grand “planners” rather than practical “searchers”: “The narrow purview of experiments also ran afoul of the grand ambitions that animated the behavioral revolution in social science.” The authors cite their productive collaboration with a political consulting firm that agreed to randomize the messaging sent to voters, acknowledging it hadn't the “slightest idea whether the most efficient use of their budget is to send four mailers to fellow partisans, nine mailers to a small set of ardent partisan supporters, or two mailers to everyone.”
Under what circumstances are field experiments for business likely to be most productive? Probably the most important predictor is an organizational searching mentality, where analysts acknowledge they don't have all the answers – but resolve to get them. These searchers are obsessed with evaluating the performance of their strategies, and what better way to test alternatives than field experiments? With government and education programs on the rise, rigorous program evaluation will increasingly be mandated, and field experiments sit at the top of the methodological rigor charts.
Organizations that have control over their business transaction units can productively deploy field experiments. Marketing campaigns for retailers and catalogers routinely benefit from experiments as does user experience with large web sites like Google, Yahoo, and Amazon. Financial services providers that deliver customizable products such as credit cards and checking accounts can use field experiments to optimize their customer and product mixes.
Green and Gerber note that decentralization is the field experimenter's ally. Change introduced gradually over time and geography offers a fertile platform for field experiments. Pilot tests of new products and geographies are cases in point, as is phased implementation. Next year's test group can be this year's control.
Finally, situations in which demand exceeds supply, or in which an innovation cannot be delivered all at once, beg for field testing. Hyper-interest in charter schools, stimulus-related jobs programs, and pilot health care programs for the uninsured are illustrations. With more applicants than program slots, a random or lottery assignment mechanism might be the fairest approach to allocation. You can be certain that whatever new health care initiatives eventually make it through Congress will be rigorously evaluated, at least partially through comprehensive field experiments.
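A lottery assignment does double duty: it allocates scarce slots fairly, and it manufactures a randomized control group for free. The sketch below is a minimal, hypothetical version – the applicant names and slot count are made up – showing that the unlucky applicants form a ready-made comparison group.

```python
import random

random.seed(2024)

# Hypothetical oversubscribed program: more applicants than slots.
applicants = [f"applicant_{i}" for i in range(500)]
n_slots = 120

# A simple lottery: sample winners uniformly at random without replacement.
admitted = set(random.sample(applicants, n_slots))        # treatment group
waitlisted = [a for a in applicants if a not in admitted]  # control group

print(len(admitted), len(waitlisted))
```

Comparing later outcomes for the admitted versus the waitlisted then estimates the program's effect without any selection bias, since chance alone decided who got in.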
A follow-up blog will examine several field experiments from public administration and the social sciences, noting their relevance for business and BI.