I almost couldn't get myself to purchase a book called Super Crunchers. The thinking was that such a title is more indicative of a Fox News Channel special investigation piece than a book on business intelligence (BI). Rupert Murdoch the new BI guru? It just doesn't seem right. Fortunately, you can't always judge a book by its title. Turns out there's a lot to like about Super Crunchers - and lots of lessons for BI, including several noted in previous OpenBI Forums. Turns out as well that the author eats his own dog food, with even the Foxian title the result of "super crunching."

If Competing on Analytics offers a solid foundation and conceptual framework for analytics in business and Smart (Enough) Systems provides painstaking detail on how to use analytics, decision rules and optimization techniques to automate routine decisions, Super Crunchers provides breadth of exposure to the current state of analytics in business, government, education and health care, offering incisive commentary on both developments and direction. Indeed, a dilemma for reviewers of Super Crunchers is deciding just what to exclude from consideration.

The Days of Wine and Numbers

A law professor and econometrician at Yale, Super Crunchers author Ian Ayres begins the discussion of predictive analytics with, appropriately enough, a modeling "application" developed by an economist that illustrates many of the issues surrounding super crunching. Orley Ashenfelter, Princeton professor and wine aficionado looking to optimize his selection of Bordeaux for purchase, developed a regression model to determine characteristics associated with vintage auction prices. His final equation reduced to three factors related to the growing season in France: winter rainfall, average temperature and harvest rainfall. The beauty - and the source of controversy - with this approach is that it allows Ashenfelter to predict the future quality of a vintage year before the wine is sold. And his predictions, published in a semiannual newsletter, have proven quite accurate - thus roiling the wine critics' community. One expert decried Ashenfelter's work as "rather like a movie critic who never goes to see the movie but tells you how good it is based on the actors and director." Fortunately, many who at first scoff at such unsophistication grudgingly relent to co-existence in time. The theme of predictive models versus resistant expert opinion or intuition is a recurring one in Super Crunchers.
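For readers who want to see the mechanics, here is a minimal sketch of a three-factor regression of the kind described above. It is not Ashenfelter's model: the vintages are synthetic, the coefficients in `TRUE_COEFS` are invented for illustration, and the names `make_synthetic_vintages`, `fit_ols` and `predict` are hypothetical. Only the three predictors (winter rainfall, average temperature, harvest rainfall) come from the book.

```python
import random

# Invented coefficients for illustration only (NOT Ashenfelter's estimates):
# intercept, winter rainfall (mm), average temperature (C), harvest rainfall (mm).
TRUE_COEFS = [-10.0, 0.002, 0.8, -0.004]


def make_synthetic_vintages(n, noise_sd, rng):
    """Generate n fake vintages: weather rows plus a noisy quality score."""
    rows, scores = [], []
    for _ in range(n):
        row = (rng.uniform(300, 900), rng.uniform(15, 19), rng.uniform(30, 300))
        score = TRUE_COEFS[0] + sum(c * x for c, x in zip(TRUE_COEFS[1:], row))
        rows.append(row)
        scores.append(score + rng.gauss(0, noise_sd))
    return rows, scores


def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coefs, row):
    """Predicted score for one (winter_rain, avg_temp, harvest_rain) row."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))


if __name__ == "__main__":
    rng = random.Random(0)
    rows, scores = make_synthetic_vintages(40, 0.1, rng)
    coefs = fit_ols(rows, scores)
    print("estimated coefficients:", [round(c, 4) for c in coefs])
    # Score a hypothetical growing season before any wine is tasted.
    print("predicted score:", round(predict(coefs, (600, 17.5, 120)), 2))
```

The point of the sketch is the last line: once the equation is fit to past vintages, a new year can be scored from weather data alone, which is precisely what irritated the critics.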

The models of Orley Ashenfelter are now legend in college econometrics and social science statistics courses, and when subsequent examples in Super Crunchers included the usual competing-on-analytics suspects Netflix, Harrah's, Amazon and baseball's Oakland A's and Boston Red Sox, I began to fret that Super Crunchers was just a me-too analytics-in-business book. Not to worry. The supply of excellent examples seems almost endless.

The wine expert's analogy of Orley Ashenfelter to a movie critic who never watches films is eerily prophetic of the mission of start-up company Epagogix to provide analytic services to the film industry. Epagogix purports to predict the gross revenues of proposed motion pictures before filming has started, using scripts only and no indications of actors or directors. With a foundation of neural net data mining technology, models developed not only predict which films are not worth making, but also provide insight for tweaking to optimize financial intake. The models have done a pretty decent job back-testing the relationship of film characteristics with revenues. One consistent finding: film stars aren't worth their cost. The Epagogix CEO projects that major studios that would adopt the model's film-making discipline would net an additional billion dollars per year. Not surprisingly, film executives and staff aren't enthusiastic about the Epagogix story. "Nobody, nobody - not now, not ever - knows the least goddamn thing about what is or isn't going to work at the box office," huffed screenwriter William Goldman.

Random Analytics

Super Crunchers differentiates itself from most of the current literature on business analytics in its obsession with randomization to test alternative hypotheses/strategies/decisions. OpenBI Forum readers hear this ad nauseam (http://dmreview.com/article_sub.cfm?articleId=1081924), but random assignment to various treatment groups helps assure that groups are "equal" on factors other than the treatments, thus enabling valid interpretation of findings. Super Crunchers provides several good illustrations of the value of randomization in the financial services industry with CapOne and Credit Indemnity, but I was most intrigued by Offermatica, which has built a business on a "testing-driven approach to Web site design [that] enables marketers to optimize their site's content and offers, to maximize their site's overall profitability." In one illustration, two alternative graphic designs were contrasted for ultimate consumer spend - and one was declared a clear revenue winner. With Offermatica, it's testing and analytics versus the opinions of Web usability experts, and the recommendations are often contradictory - and probably contentious. Super Crunchers wonders why randomization techniques are not even more pervasive in business. My guess is randomization and experimentation might in fact be more common than is generally acknowledged, especially for e-commerce companies. Such firms, though, may wish to keep their testing strategies confidential.
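A rough sketch of what such a randomized comparison of two designs might look like follows. This is not Offermatica's method, which the book does not detail; the function names `random_assign` and `permutation_p_value` are hypothetical, the spend figures are simulated, and I have swapped in a simple permutation test on mean spend as one reasonable way to judge a "clear winner."

```python
import random
from statistics import mean


def random_assign(units, rng):
    """Shuffle the units and split them into two (nearly) equal groups."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def permutation_p_value(group_a, group_b, n_permutations, rng):
    """Two-sided permutation test on the difference in mean outcome."""
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # re-deal the outcomes as if labels were arbitrary
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    rng = random.Random(42)
    visitors_a, visitors_b = random_assign(range(400), rng)
    # Simulated spend per visitor; the means 20 and 23 are invented.
    spend_a = [max(0.0, rng.gauss(20, 8)) for _ in visitors_a]
    spend_b = [max(0.0, rng.gauss(23, 8)) for _ in visitors_b]
    print("mean spend, design A:", round(mean(spend_a), 2))
    print("mean spend, design B:", round(mean(spend_b), 2))
    print("p-value:", permutation_p_value(spend_a, spend_b, 2000, rng))
```

Because the assignment is random, the two groups of visitors should differ only by chance on everything except the design they saw, so a small p-value points at the design itself rather than at who happened to see it.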

Do Government Programs Work?

Super Crunchers hits its stride with discussion of the effectiveness of government programs, an area of much academic attention over the years, starting with the work of the late Senator (and former professor) Daniel Patrick Moynihan 40 years ago. Indeed, it's not a stretch to say that super crunching for business owes a debt of gratitude to the mandates of Great Society programs to rigorously evaluate their efficacy. A new wave of experimental social scientists is paving the way for the next generation of policy initiatives. Esther Duflo, an economist with the Poverty Action Lab at MIT, is co-instructor of the Putting Social Studies to the Test OpenCourseWare curriculum reviewed by the OpenBI Forum in June (http://dmreview.com/article_sub.cfm?articleId=1087121).

There is no shortage of solid government programs that use both sophisticated designs and analytics to prove their value. A good illustration chronicled by Super Crunchers is the Progresa Program for Education, Health and Nutrition to alleviate poverty in Mexico. The idea behind Progresa is to provide a conditional transfer of cash to poor families that exhibit desired behaviors like assuring school attendance for children, observing nutrition protocols and maintaining proper levels of prenatal care. A 506-village, 24,000-household randomized test was conducted in 1997 comparing Progresa-designated villages to controls. The results of the experiment were both provocative and gratifying. Progresa household children attended school much more regularly than the controls, and there was a 12 percent reduction in serious illness for Progresa family children. The pilot was deemed so successful that the next political regime implemented a new, more sweeping program several years later.

The Freaks

The 2005 publication of Freakonomics, co-authored by Steven D. Levitt, has conferred legitimacy on a new category of social scientists - those who use the excellent analytical tools of the social science disciplines, but are just marginally interested in the questions traditionally asked by academics. In the past, such rogue behavior would have been discouraged. Today, fortunately, more latitude is given young scholars. Social science is all about human behavior, about explaining how people get what they want - so potentially many additional areas of inquiry are accessible to academic analysts.

Super Crunchers discusses an analysis that looks at point-shaving corruption in college basketball by examining actual game winning margins versus those predicted by the gambling point spread. For all games over the period in question, the distribution was a spot-on bell curve, with 50 percent of favorites beating the spread. When only games where the spread was greater than 12 points were considered, just 47 percent of favored teams beat the spread. When the scores of those games are further examined, however, the favorite team was tracking to win 50 percent of the time with less than five minutes remaining. This freak analysis provides initial circumstantial evidence of potential point shaving in games - where the favorite lets up (but still wins) with the game in hand.
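The core comparison in that analysis is simple enough to sketch. The snippet below is an illustration only, not the researchers' code: `cover_rate` is a hypothetical helper, the handful of games is made up, and dropping ties against the spread ("pushes") is my own simplifying assumption. It does not attempt the late-game tracking step.

```python
def cover_rate(games, min_spread=0.0):
    """Share of favorites that beat the spread, among games with spread > min_spread.

    Each game is a (spread, margin) pair: the points the favorite was expected
    to win by, and the points it actually won by (negative if it lost).
    Pushes (margin == spread) are dropped; returns None if no games qualify.
    """
    decided = [(s, m) for s, m in games if s > min_spread and m != s]
    if not decided:
        return None
    covered = sum(1 for s, m in decided if m > s)
    return covered / len(decided)


if __name__ == "__main__":
    # Made-up (spread, margin) pairs, purely to exercise the function.
    games = [(3, 7), (5, 2), (14, 10), (15, 20), (13, 1), (20, 20), (6.5, 10)]
    print("all games:", cover_rate(games))
    print("spread above 12:", cover_rate(games, min_spread=12))
```

With real data, the interesting signal is the gap between the two numbers: a rate near 50 percent overall but noticeably lower for heavy favorites is the pattern the book describes.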

Ian Ayres and Steve Levitt teamed up to research the impact of the LoJack vehicle-recovery device for deterring car theft. With LoJack in place, stolen vehicles are not only recovered most of the time, but the thieves are often caught. The hypothesis posited by Ayres and Levitt is that if enough car owners use the LoJack vehicle recovery device, the overall incidence of theft, even for cars without the device, would decline. And if that proved out, then auto insurers might be persuaded to offer discounts to LoJack users. An examination of data for 14 years in 56 cities indicated a $500 investment in LoJack cut car theft expense by $5,000 - thus providing the basis for substantial owner discounts.

Caveat Emptor

Super crunching is now de rigueur in the business world, but that world is not a utopia, and Super Crunchers isn't afraid to offer commentary. In 2007, "best" customers are often not those who buy the most product, but rather those who are most profitable. And many profitable customers are paying more than they should. So preferred customers, especially in travel, leisure, telecom and financial services may, in many cases, be getting ripped off. An even more insidious example offered by Super Crunchers is the specter of a gaming company like Harrah's managing the addiction of its "Total Rewards" customers by computerizing their every move, helping to "cool the marks off" as losses mount.

Super crunching can just as easily be used for shady practices as it can for legitimate business. Analytics for discrimination is certainly nothing new. And proxy discrimination, where the attribute in question, say race, is itself predicted from other variables and used as a subsequent predictor, is easy to deploy. What if the analytic models are mistaken or based on bad data or invalid study designs? What about the potential invasions of privacy by data aggregators such as ChoicePoint or marketing intelligence firms like Acxiom - or data-everything Google?

As the use of super crunching in business continues to grow, we are certain to see attempts to level the analytics playing field between sellers and buyers. The widespread use of consumer-oriented counter-analytics Web sites such as farecast.com and zillow.com, which help buyers by sharing price information for air travel and real estate respectively, is certainly a good first step.

Experts versus Analytics

There'd probably be little argument from even die-hard nonbelievers on the continued growth of analytics and super crunching in the business world. The technology's there, the methods are there, demand is there, the business platform is there and, most important, the results are there. The trajectory might not be quite as steep as first envisioned - it rarely is - but super crunching is here to stay - and prosper. What then are the roles of experts and analytics in the super crunching world?

Super Crunchers advocates that experts hypothesize/theorize while analytics test/evaluate. Experts provide diagnostic thinking and analysis - asking the right questions, ultimately developing theories of cause and effect. They determine what factors are relevant (and irrelevant), casting them in theory form: "if factor A then conclusion B" or "the more of factor A, the less of conclusion B." The theories are then tested by the tools of super crunching - with feedback ultimately finding its way into subsequent iterations. Perhaps the new breed of "freak" social scientists, quantitative and intuitive, able to empathize with experts but also facile with analytics, can help lead the way.

Super Crunchers Assessed

Quantitative colleagues with whom I've discussed Super Crunchers have raised several objections to the book. The first is that the analyses are not rigorous enough, the author rather cavalierly detailing normal curves, standard deviations, randomization, regression analysis, neural net data mining, etc. I don't much subscribe to that argument. True, Ayres at times takes a few liberties with analytic rigor (I winced when I read his 95 percent rule for stock portfolio returns, for example), but I'm sure he didn't set out to write an American Statistical Association refereed treatise. Rather, Super Crunchers is intended to educate an intelligent public with a breadth of illustrations from many disciplines on how analytics are changing the way we live and conduct business.

The second critique is that Ayres is best in the academic worlds of economics, political science, education and "Freakonomics" - with business applications secondary. I certainly agree that Super Crunchers does a great job discussing research studies, but most of the business analyses hit the mark as well, even if not with quite the same authority. I guess I'm more than willing to pay a small price of familiarity for the larger ambition of Super Crunchers. I also think that adding Ayres' brand of quantitative social science methodology to the for-profit world is a major boon for business - offering new eyes and approaches to analyzing business problems that are now strategically critical. As Ayres notes, "this book is about the leakage of social science methods to the world of on-the-ground decision-making." Internet juggernauts Google, Amazon, Yahoo, eBay and Microsoft apparently agree with Ayres, leading the business pack in hiring advanced-degreed quantitative social scientists from top universities.

Overall, I feel that Super Crunchers is an outstanding contribution to business intelligence that should make the short list of must-read books in 2007 for serious BI practitioners and business strategists. The book offers a wealth of pertinent analytic applications from a variety of disciplines, also addressing some of the tougher imminent concerns such as privacy, ethics, and the co-habitation of analytics and expertise. Perhaps like Orley Ashenfelter's predictive models, Super Crunchers will incite the critics at first, ultimately proving its mettle with accurate predictions over time.

References:

  1. Ian Ayres. Super Crunchers - Why Thinking-By-Numbers is the New Way to Be Smart. Bantam Books. 2007.
  2. James Taylor and Neil Raden. Smart (Enough) Systems - How to Deliver Competitive Advantage by Automating the Decisions Hidden in Your Business. Prentice Hall. 2007.
  3. Thomas H. Davenport and Jeanne G. Harris. Competing on Analytics - The New Science of Winning. Harvard Business School Press. 2007.
  4. Steven D. Levitt and Stephen J. Dubner. Freakonomics - A Rogue Economist Explores the Hidden Side of Everything. HarperCollins. 2005.