For the first three weeks of January, I thought Chicagoland was certain to get a second straight reprieve from harsh winter. Every day, we were setting a record for the number of consecutive days without an inch or more of snow. And the temperatures hovered in the 20s, 30s and 40s – seemingly far warmer than historical Januarys.
Alas, winter has returned to the area with a vengeance over the last three weeks. While spared the 2+ foot blizzard that just hit the Northeast, we’ve had more than enough snow, ice, rain, wind and bitter cold of our own. And to think, there’s at least another month of the same ...
A few days ago I headed out on my early morning four-mile hike but turned back after just a mile; the cold was simply too much for me. My iPhone said it was -2 degrees, but the weather news on TV noted a -26 wind chill. All I knew was that it was too cold to be outside!
Once I thawed out, I did a little research on the wind chill index (WCI). For those from more temperate climates, the WCI describes “the relative discomfort/danger resulting from the combination of wind and temperature ... an equivalent temperature at which the heat loss from exposed flesh would be the same if the wind were near calm. For example, a wind chill index of -5 indicates that the effects of wind and temperature on exposed flesh are the same as if the air temperature were 5 degrees below zero even though the actual temperature is much higher.”
The now commonly accepted equation for WCI is a function of the air temperature T (in degrees Fahrenheit) and the wind speed WS (in miles per hour): WCI = 35.74 + 0.6215*T - 35.75*(WS**0.16) + 0.4275*T*(WS**0.16). In statistical terms, temperature and wind speed interact to determine the wind chill index, and the relationship is curvilinear rather than a straight line.
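To make the arithmetic concrete, here is a minimal sketch of the equation as a function. It is written in Python purely for illustration (it is not the R code behind this post), and the function name `wind_chill` and the 25 mph wind speed are my own assumptions; the TV report gave only the temperature and the resulting wind chill.

```python
def wind_chill(t, ws):
    """Wind chill index from the equation above.

    t  -- air temperature in degrees Fahrenheit
    ws -- wind speed in miles per hour
    """
    ws_term = ws ** 0.16  # the curvilinear wind-speed term shared by two parts
    return 35.74 + 0.6215 * t - 35.75 * ws_term + 0.4275 * t * ws_term


# Hypothetical example: the -2 degree morning with an assumed 25 mph wind.
print(round(wind_chill(-2, 25), 1))
```

With that assumed wind speed the result lands within a degree of the -26 wind chill reported on TV, which suggests the gusts that morning were somewhere in the low-to-mid 20s.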
Ever the geek, I just had to take a look at the WCI in R. To do so, I first created an R data frame with attributes T, WS and WCI, computed from the equation above over temperatures from -45 to 40 degrees and wind speeds from 5 to 60 miles per hour. I then graphed the three-way relationship using the level plot with contours from the R lattice package. Figure 1 shows that graph. The level plot does a decent job of representing a three-dimensional relationship in two-dimensional space. The “contour” curves – in this case combinations of temperatures and wind speeds that yield roughly the same wind chills – clearly show the interaction and curvilinear relationships among WCI and T/WS.
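The grid behind that data frame can be sketched as follows. This is a Python stand-in for the R data frame, not the original code, and the 5-degree and 5-mph step sizes are assumptions; the post specifies only the end points of each range.

```python
def wind_chill(t, ws):
    """Wind chill index (T in degrees F, WS in mph) from the equation above."""
    return 35.74 + 0.6215 * t - 35.75 * ws ** 0.16 + 0.4275 * t * ws ** 0.16


# Assumed step sizes: every 5 degrees and every 5 mph.
temps = range(-45, 41, 5)  # -45, -40, ..., 40
winds = range(5, 61, 5)    # 5, 10, ..., 60

# One row per (T, WS) combination, mirroring the data frame's three attributes.
grid = [
    {"T": t, "WS": ws, "WCI": wind_chill(t, ws)}
    for t in temps
    for ws in winds
]

coldest = min(grid, key=lambda row: row["WCI"])
mildest = max(grid, key=lambda row: row["WCI"])
print(len(grid), "rows")
print("coldest:", coldest)
print("mildest:", mildest)
```

The extremes sit at opposite corners of the grid: the lowest wind chill comes from the coldest temperature paired with the strongest wind, and the highest from the warmest temperature paired with the lightest wind.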
Not content to simply visualize WCI, I needed to determine if I could predict it with T and WS using learning models that were unaware of the complex mathematical relationship. I experimented with four such models to find out. The first was a simple linear and additive regression using T and WS; the second, the same as the first with an additional T*WS interaction term; the third, regression with cubic splines in both T and WS and the interaction; and finally, areg, additive regression with optimal transformations on both sides, from Frank Harrell’s splendid Hmisc package.
Figure 2 details the results of plotting the predictions from the learning models against “actuals” from the mathematical equation. The closer the points to the y=x line, the better the fit.
The simple linear model in Figure 2A is the least faithful reproduction, followed by the linear model with interaction in 2B. The curvilinear/multiplicative models in 2C and 2D best approximate the mathematical calculations. Flexible, curvilinear learning models such as those that underpin 2C and 2D are often capable of producing predictions very close to those of unknown but sophisticated mathematical functional forms. In R, such models are also easy to use and are quickly becoming a staple of the data science arsenal.
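For readers who want to reproduce the flavor of the first two models without R, here is a rough sketch in plain Python. It fits the additive and interaction regressions by ordinary least squares and summarizes each fit with a root-mean-square error instead of a plot. The grid step sizes, the helper names (`solve`, `fit_ols`, `rmse`) and the choice of RMSE as the yardstick are all my assumptions; the spline and areg models are not attempted here.

```python
def wind_chill(t, ws):
    """Wind chill index (T in degrees F, WS in mph) from the equation above."""
    return 35.74 + 0.6215 * t - 35.75 * ws ** 0.16 + 0.4275 * t * ws ** 0.16


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) beta = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def rmse(X, y, beta):
    """Root-mean-square gap between predictions and the equation's 'actuals'."""
    errors = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    return (sum(e * e for e in errors) / len(errors)) ** 0.5


# Assumed grid: every 5 degrees from -45 to 40, every 5 mph from 5 to 60.
points = [(t, ws) for t in range(-45, 41, 5) for ws in range(5, 61, 5)]
actual = [wind_chill(t, ws) for t, ws in points]

X_additive = [[1.0, t, ws] for t, ws in points]             # T + WS
X_interaction = [[1.0, t, ws, t * ws] for t, ws in points]  # T + WS + T*WS

rmse_additive = rmse(X_additive, actual, fit_ols(X_additive, actual))
rmse_interaction = rmse(X_interaction, actual, fit_ols(X_interaction, actual))
print("additive RMSE:   ", rmse_additive)
print("interaction RMSE:", rmse_interaction)
```

Because the additive model is nested inside the interaction model, the second error cannot exceed the first. The error that remains comes from the WS**0.16 curvature, which a straight-line term in WS cannot follow; that is the gap the spline and areg fits are designed to close.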