Bias in BI
A few weeks ago, I wrote about the U.S. News Best Colleges rankings for 2011. One of my observations was that the rankings seemed to favor small, highly selective and well-endowed private schools over larger public state universities. Indeed, for 2011, no state school cracked the top 20 national universities, and only two, Berkeley and UCLA, made the top 25.
Not to be outdone, the Wall Street Journal just published its top schools as rated by corporate recruiters, but with very different findings. Of the top 25 schools in its rankings, all but six are public state universities, and those with the very largest enrollments, such as Penn State, Texas A&M, Illinois, Purdue, Arizona State and Ohio State, appear to be disproportionately represented. What gives? Are these rankings in any way “biased”?
My read of the methodology behind the WSJ survey suggests the design itself is in large part responsible for the WSJ's surprising findings. For this study, 842 recruiting managers from many of the largest public, private and not-for-profit employers were surveyed. Of the 842, 479, or 57 percent, responded, reporting a total of 43,000 college hires for 2009.
Recruiters were asked to name, in rank order, their top schools overall and their top schools by study major from a final list of eight. Respondents could only rank schools and majors from which they actively recruit. The ranked majors included Accounting, Marketing, Engineering, Business/Economics, Finance, Computer Science, MIS and Liberal Arts.
“To calculate the final ranking we did the following: First we assigned 10 points to each No. 1 ranking, 9 points to each No. 2 ranking, 8 points to each No. 3 ranking — and so on — for each school. For the overall ranking, those ratings were weighted by the number of total graduates that a company reported hiring in the prior year ... For the ranking of schools by major, those ratings were weighted by the number of graduates each company hired in that specific major. To be considered for the majors ranking, a school had to have at least seven companies rank it; most had more.”
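To see how that weighting plays out, here is a minimal Python sketch of the scoring as I read the quoted description. The ballots, school names and hire counts are made up for illustration, and two details are my assumptions rather than anything the WSJ states: that "weighted" means multiplying rank points by a company's hires, and that points bottom out at zero past the tenth position.

```python
from collections import defaultdict

def rank_points(position):
    """10 points for a No. 1 ranking, 9 for No. 2, and so on.
    The floor of 0 past position 10 is an assumption."""
    return max(11 - position, 0)

def weighted_scores(ballots):
    """ballots: list of (hires, [schools in rank order]), one per recruiter.
    Multiplying points by hires is an assumed reading of 'weighted'."""
    totals = defaultdict(float)
    for hires, ranked_schools in ballots:
        for position, school in enumerate(ranked_schools, start=1):
            totals[school] += hires * rank_points(position)
    return dict(totals)

# Hypothetical ballots: one large employer and two small ones.
ballots = [
    (500, ["Big State U", "Small Private College"]),
    (20, ["Small Private College", "Big State U"]),
    (20, ["Small Private College", "Big State U"]),
]

scores = weighted_scores(ballots)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, scores)
```

In this toy case two of the three recruiters put the small college first, yet the large employer's 500 hires carry Big State U to the top. That is the mechanism by which a hire-weighted design can tilt towards schools that feed big recruiters.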
Just as the U.S. News ratings seem predisposed towards private schools, the WSJ rankings, weighted by the total number of graduates each responding company hired, clearly favor larger institutions with hefty graduating classes. And with a business, technology and engineering focus, it's not surprising that state and engineering schools fare so well in contrast to elite arts and sciences institutions, many of which don't even offer undergraduate business degrees. Perhaps combining the U.S. News and WSJ ratings would cancel out their respective biases!
There are a number of meanings of bias pertinent for BI. The most prominent, and the one generally referenced in BI research, is prejudice, wherein findings are inclined to specific outcomes by design from the get-go. Indeed, the whole purpose of some “research” is to show a pre-ordained result. An example of this is a report based primarily on responses from a vendor's customers that shows the vendor's product in a market leadership position. Go figure. When BI research claims its “unbiasedness,” it generally means the results are not prejudiced by design, i.e., are not simply marketing hubris. That's a start but not enough.
Statisticians have their own definitions of bias. An unbiased estimate is one that, on average, hits the mark. A biased estimate, in contrast, systematically overshoots or undershoots the true population value over the long haul. I well remember best linear unbiased estimates (BLUE) from grad school days many years ago. In the past, statisticians were obsessed with unbiased estimation, but now are willing to tolerate small amounts of bias for less estimate variability – the so-called bias/variance tradeoff. Better to be approximately right with certainty.
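A textbook illustration of that tradeoff is the sample variance. The small simulation below is a sketch under assumptions of my own choosing (normally distributed data with true variance 1, samples of size 5, a fixed seed): it compares the unbiased estimator, which divides by n - 1, with the biased one, which divides by n.

```python
import random
import statistics

def simulate(trials=20000, n=5, seed=0):
    """Compare variance estimators on repeated samples from N(0, 1)."""
    rng = random.Random(seed)
    true_var = 1.0
    unbiased, biased = [], []
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        unbiased.append(statistics.variance(sample))   # divides by n - 1
        biased.append(statistics.pvariance(sample))    # divides by n

    def summarize(estimates):
        mean = statistics.fmean(estimates)
        mse = statistics.fmean([(e - true_var) ** 2 for e in estimates])
        return mean, mse

    return summarize(unbiased), summarize(biased)

(unbiased_mean, unbiased_mse), (biased_mean, biased_mse) = simulate()
print(f"unbiased: mean={unbiased_mean:.3f}  mse={unbiased_mse:.3f}")
print(f"biased:   mean={biased_mean:.3f}  mse={biased_mse:.3f}")
```

Under these assumptions the unbiased estimator should average close to 1, while the biased one averages closer to 0.8 yet shows the smaller mean squared error. It misses the mark on average but sits nearer to it on a typical draw.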
Research surveys are biased to the extent that the profile of sample respondents differs systematically from the underlying population of interest. Variations of random sampling to select survey respondents can go a long way to minimizing this bias, as illustrated by modern political polling techniques. This is in part because with randomization, samples don't choose themselves. Alas, BI surveys are generally completed by voluntary, self-selecting web respondents, who are therefore often radically different from the population they're supposed to represent. In short, a biased sample.
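The difference between a sample the researcher draws and one that draws itself can be shown with a toy simulation. Everything here is hypothetical: a population of satisfaction scores from 1 to 10, and an assumed response mechanism in which the chance of volunteering equals score / 10, so happier customers answer more often.

```python
import random
import statistics

def simulate_survey(population_size=20000, sample_size=1000, seed=1):
    rng = random.Random(seed)
    # Hypothetical population: scores 1..10, equally likely.
    population = [rng.randint(1, 10) for _ in range(population_size)]
    # Random sampling: the researcher decides who is asked.
    random_sample = rng.sample(population, sample_size)
    # Self-selection (assumed mechanism): probability of responding
    # rises with the score, here score / 10.
    volunteers = [s for s in population if rng.random() < s / 10]
    return (
        statistics.fmean(population),
        statistics.fmean(random_sample),
        statistics.fmean(volunteers),
    )

pop_mean, random_mean, volunteer_mean = simulate_survey()
print(f"population={pop_mean:.2f}  random={random_mean:.2f}  "
      f"volunteers={volunteer_mean:.2f}")
```

With this mechanism the random sample should track the population mean of about 5.5, while the volunteers, though far more numerous, average about 7. More respondents do not repair a sample that selected itself.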
The overall design or methodology of BI performance measurement might also be biased, especially if random assignment to the strategic comparison groupings isn't feasible. Natural groups often differ systematically out of the gate, making intervention comparisons problematic. For example, a market for the pilot test of a new product might differ on socio-economic measures from an existing comparison market. Are differences in pre-post measurement between the groups then due to the pilot product? Or rather differences in the markets? Random assignment to the comparison groups minimizes the threat of the latter.
Most BI surveys/research go to great pains to establish their “unbiasedness.” And while I don't doubt the researchers' sincerity, I think the term is used to connote the absence of prejudice rather than the more precise statistical definitions. Indeed, the methods used today to elicit survey responses all but ensure methodological bias in BI survey research. And without random assignment to the competing groups of performance measurement, it's pretty hard for BI analysts to refute alternative explanations for their findings.
Fortunately, that BI research is almost certainly biased isn't necessarily devastating. Often, large sample sizes provide protection from its complications. But to mitigate the potential damage of bias, researchers must recognize and acknowledge the problems built into their designs and also investigate the consequences. What randomization and other fastidious methods buy is the comfort of knowing that factors outside the control of the research that could influence results are at least somewhat neutralized.
With self-selecting, voluntary response surveys, however, certain geographies, products, sectors, company sizes, BI maturity levels, etc. might be disproportionately represented, potentially skewing the findings. At a minimum, researchers should tabulate those extraneous variables, comparing their distributions to what is known of the population. The more the sample distributions of these factors look like those of the population, the more comfort the researcher can have that her results aren't spurious. On the other hand, if respondents differ from the population on outside factors in meaningful ways, the researcher must test whether those differences are responsible for the results. In some instances, adjustments can be made to compensate for the impact of bias.
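One simple version of that tabulate-then-adjust routine is post-stratification: compare each group's share of the sample with its known share of the population, then reweight the group results by the population shares. The sketch below uses invented numbers (respondent counts by company size, population shares, and a mean "BI maturity" score per group) and is only one of several possible adjustments.

```python
def shares(counts):
    """Convert raw counts per group into proportions."""
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def weighted_mean(group_means, group_shares):
    """Average the group means using the supplied shares as weights."""
    return sum(group_shares[g] * group_means[g] for g in group_shares)

# Hypothetical survey: respondents by company size.
sample_counts = {"small": 50, "mid": 150, "large": 300}
# Hypothetical known population shares for the same groups.
population_shares = {"small": 0.60, "mid": 0.30, "large": 0.10}
# Hypothetical mean BI maturity score observed in each group.
sample_means = {"small": 2.0, "mid": 3.0, "large": 4.0}

sample_shares = shares(sample_counts)
# Step 1: tabulate how far the sample profile is from the population.
gaps = {g: sample_shares[g] - population_shares[g] for g in population_shares}
# Step 2: compare the raw result with the post-stratified one.
raw_mean = weighted_mean(sample_means, sample_shares)
adjusted_mean = weighted_mean(sample_means, population_shares)
print(gaps)
print(f"raw={raw_mean:.2f}  adjusted={adjusted_mean:.2f}")
```

Here large companies make up 60 percent of respondents but only 10 percent of the population, so the raw average of 3.5 drops to 2.5 once the groups are reweighted. The adjustment only helps for factors the researcher has measured and for which population figures are known.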
Going forward, it's my hope that consumers of BI research and performance measurement make the search for and handling of bias a driving consideration in their evaluation of quality.