June 11, 2013 – The ever-expanding Predictive Analytics World conference series continues to bring smart strategies and business-minded discussions to the forward-thinking field, including insight this year from leaders at Google, Orbitz, the City of Chicago and others.
In our second year covering the event in person, Information-Management.com will post quotes and insight overheard from speakers and experts on hand at the conference headquarters at McCormick Place in Chicago.
8:01 a.m. CST: Should’ve asked for a predictive model on the likelihood of my train to Chicago being delayed by the vague and disconcerting “track problems.” Followed up a snarky mention of that with a search for “predictive analytics” and “train schedules,” which turned up a neat little piece on Union Pacific’s use of acoustic rail data to predict imminent track failures. Looks like I’ll be doing more searching and reading, and less grumbling, en route to the Windy City.
11:12 a.m.: Robert Lancaster of Orbitz Worldwide, the online travel pricing site, is diving right into Kaplan-Meier estimates and survival regression packages in R before a fairly full and attentive room. Lancaster says that to track changes in rates and other variables, his team loads rake searches of data sources on their site in real time into MongoDB. That load is then moved by the hour into Hadoop, and by the day into MapReduce, with a summary of these rates every three months. With this, Lancaster says, Orbitz can accurately gauge the dates when hotels will receive more traffic. Las Vegas, for example, fills up on weekends, but also fills more heavily near the end of the year, while Chicago is more of a “working” market for hotels, with room rates fairly high during the week. From this, Lancaster and his team also create predictive models on customer “families” and their likelihood of finding a room, or to nail down the forecasted room cost, for instance.
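For readers who have not met the Kaplan-Meier estimator Lancaster mentioned, here is a minimal sketch in Python. It is not Orbitz's code (the talk used R packages), and the data is invented for illustration: assume each record is the number of days a hypothetical hotel date stayed bookable, plus a flag saying whether it actually sold out (an observed event) or the observation simply ended first (censored). The function name `kaplan_meier` and the sample numbers are assumptions of this sketch.

```python
def kaplan_meier(durations, observed):
    """Return the Kaplan-Meier survival curve as a list of (time, S(t)) pairs.

    durations: time until the event or until observation stopped
    observed:  True if the event was seen, False if the record is censored
    """
    records = sorted(zip(durations, observed))
    at_risk = len(records)   # records still under observation
    survival = 1.0           # running product of (1 - events / at_risk)
    curve = []
    i = 0
    while i < len(records):
        t = records[i][0]
        events = 0
        leaving = 0
        # Group all records that share the same time t.
        while i < len(records) and records[i][0] == t:
            if records[i][1]:
                events += 1
            leaving += 1
            i += 1
        if events > 0:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Both events and censored records leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical data: days until sell-out, and whether a sell-out was observed.
days = [1, 2, 2, 3, 4]
sold_out = [True, True, False, True, False]
for t, s in kaplan_meier(days, sold_out):
    print(f"day {t}: estimated share still available = {s:.2f}")
```

Censored records do not lower the curve themselves, but they do shrink the risk set for later times, which is what separates this estimate from a simple running percentage.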
11:30 a.m.: A few doors down, PAW program chair Eric Siegel is going through some elements of his recent, solid pop-sci book, including Chase Bank’s customer risk models on mortgages. He then cites the once oft-touted predictive models around cell phone company customer retention. As companies relied on predictive models that offered existing customers a new phone at some point in the future, they instead “instigated the reverse effect.” By that, Siegel means some customers can be predicted to stay put in their contentment, a group he calls (tongue-in-cheek) “sleeping dogs.”
11:55 a.m.: Kelly Zhao of Fifth Third Bank is going through complicated but relevant models of the probability that homeowners will default on their loans, and the related “stress testing” scenario analysis required by federal regulators. Heavy on math – which is welcome to the modeler pros in the room, for sure – though there’s definitely a compliance heartbeat behind her work to expand the “narrow analysis” of banking limits as required by the feds.
12:27 p.m.: More vendors and new names than last year and, in a completely cursory look around the room, a much younger vibe than a lot of the other conference circuits. That could be related to the availability of chocolate cheesecake, or, in one of the weirder tchotchkes I’ve ever seen, the drawing for the remote control helicopters at the ClickTale booth. Lunch hour traffic and discussion around the booths of A/B testing vendor Optimizely and analytic standby KXEN, too, for what it’s worth.
2:12 p.m.: IBM Watson preso, lots on big data and the trial and error of the trivia show-winning computer, but not a ton on the predictive side of its analytics. Health care use cases with Watson sound as though they’re following through on the promises made when it was first introduced. Also, refreshing to hear the “trials and tribulations” with the development of the big analytic brain behind Watson, where at one point IBM thought they might just have a “science project” on their hands, according to Worldwide Marketing VP Stephen Gold.
2:38 p.m.: Solid round of “elevator pitches” from AbsolutData, StatSoft, Mu Sigma, Forum Analytics and, from a bit outside of the traditional sources, Monsanto and the University of California-Irvine Extension (UCI). Advanced modeling, mining and business-minded analytics skills are big on the minds of leading data-driven enterprises, so it’s refreshing to see academics join the fray. UCI works with PAW on some instruction, but is still the only university with an active presence at the event, from what I can tell today, at least. Monsanto has been “bitten by the big data bug” and made its pitch to modelers and analysts in the crowd to hear more about its “candy store” of genomic, molecular and agricultural data.
2:35 p.m.: Jim Tesiero, principal mathematician and head of data science for San Diego-based ad agency Zeeto Media, is getting into graph-theoretic frameworks for targeting customers more accurately. We’ve had the discussion over the last few years on our site about the hard and soft science differences with data science, but Tesiero brought an impressive background in physics to his models to increase conversion rates from online advertisements. To winnow 1,400 features from the original dimension of their models based on what elements were truly predictive of buyers and ad respondents, Tesiero found a handful of “neighbor” interactions, those elements that have the most influence on users to buy given their background, geographic background, or other demographics. His team was able to increase conversion rates and define with more accuracy the types of buyers for varying ads. Tesiero likened it in a way to microscopic interactions in physics. But in more business- or public-minded terms, he says: “Think of it like a social network: there may be someone who has more influence ... or has more authority over a network group in general.” (As a side note, it’s great to hear from an actual data scientist on actual business uses.)
3:14 p.m.: Somber talk on the lack of quality data coming out of health care systems, with a silver lining from the possibilities in matching up claims, compliance and provider data from Stephen Omans, CEO and founder of Deal Me Health, a Chicago-based online scheduling vendor. Omans says the biggest obstacle in converting to ICD-10 is the “amount of codes” that are increasing and include alpha-numeric data.
4:45 p.m.: Looking ahead at the schedule for Wednesday, the final day of the event, there are some intriguing talks scheduled on PA for collections by payroll processor Paychex, communications between analysts and senior management by Managed Analytic Services and the reliably smart Dean Abbott giving his “five predictive analytics pet peeves.” May have to make room for two days next year.