06 November 2012

Silver Models

Nate Silver. Where to even start with this whole brouhaha about his election forecasting?

How about here: Conservatives, a lot of you made yourselves look like real clowns with your bullshit criticisms. We're talking about some real hackery going on.

I could probably give an hour-long lecture on modeling using spurious critiques from National Review Online as the framing structure.

All the criticisms I've seen (at least the ones that rise above "not even wrong" levels) boil down to this: if the state-level polling is wrong, Silver's projections will be wrong. Yes. Thank you. Also good to know: if every thermometer in the state is wrong today, then your meteorologist will do a bad job of predicting tomorrow's weather. And maybe statewide polls do have a consistent bias (in either the technical or colloquial sense), maybe there is some big Bradley effect going on, maybe their sampling technique is off. Who knows? But if every poll is wrong, all in the same direction, then the problem is that every poll is wrong in the same direction, not that Nate Silver is pulling numerical trickery.

I don't have the patience to wade through all of this stuff with Silver. Besides, Sonic did it better over at RWCG. Go scroll through his posts over the last couple of weeks, or maybe search for "Silverbating."

This was a good piece at HBR:
HBR Blog Network | Justin Fox | What Does Nate Silver's '80.9% Chance of Winning' Mean, Anyway?

Nate Silver, the KPMG-consultant-turned-poker-player-turned-sabermetrician-turned-prescient-forecaster-of-the-2008-election, has suddenly become the Most Controversial Man in America. [...]

In fact, Silver acknowledges pretty much every reasonable criticism of his approach — his writing is a model of intellectual honesty. If Romney crushes Obama on Nov. 6, Silver can be expected to react not with denial, excuses, or silence (the standard responses of the political pundit who gets it wrong) but with a straightforward, fact-filled analysis of where he messed up. If all we were talking about was Silver's writing, in fact, I could stop right here.

But the writing isn't the only thing that draws people to Silver's blog. In fact, I would venture to guess that, say, 80.9% of the visitors to fivethirtyeight.blogs.nytimes.com are there simply to look at his percentage forecast of the election (which as I write this gives Obama an 80.9% chance of winning). I know I check it several times a day to see if the number has changed. I also know that, in many ways, a percentage forecast is the most honest way of calibrating one's balance between confidence and uncertainty. But as a way of communicating uncertainty, the percentage forecast is also deeply flawed.
I get the sense from reading Silver, and from interacting with a lot of quantitative types like him, that he sees this as his readers' problem and not his. And he's right, frankly. Yes, he takes pains to explain what he means, but at a certain point innumeracy is the reader's fault, not his. He doesn't — and shouldn't — run a remedial math blog.
We see more certainty than is actually there. Or, as Steven Alexander jokingly tweeted about Silver's forecasts:
2:1 odds are basically a lock. The probability of losing is like 1 in a million.
I'm really glad he's joking. Two-to-one odds mean losing one time in three, not one in a million. I bet most people would read this and think "yeah, that sounds about right" rather than "ummm, that's off by a factor of several hundred thousand."
Silver has repeatedly, and not very convincingly, tried to explain his percentage forecasts in the context of a football game: Romney is behind, but could still win with a last-minute touchdown. The problem with this metaphor — as, of course, Silver has acknowledged — is that we don't actually know the score of the game. We're standing outside the stadium and guessing the score based on crowd noise. So the source of uncertainty resides at least as much in the potential for mismeasurement as in the potential for last-minute game changers.
This metaphor is just spot on for me, mostly because I spent some time last week building a Monte Carlo model in Matlab to help me make fantasy football roster decisions. Yes, I am that big of a geek. But I'm also tied for first in my league, so I'm doing something right.

It's actually a great exercise. So please, journalists, before you spend an afternoon writing a critique of Nate Silver or some other modeler, please try to model something. Anything. Open up a spreadsheet or Octave or NetLogo or any of several other free software systems and muck around a little.
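To make that concrete, here's a minimal sketch of the kind of toy model I have in mind, in Python rather than Octave. Every probability, state, and safe-state count below is invented for illustration; none of these are Silver's actual inputs:

```python
import random

# Hypothetical per-state Obama win probabilities and electoral votes.
# These numbers are made up for illustration, not taken from any model.
SWING = {
    "OH": (0.75, 18),
    "FL": (0.50, 29),
    "VA": (0.65, 13),
    "CO": (0.70, 9),
    "NV": (0.80, 6),
    "IA": (0.75, 6),
    "NH": (0.75, 4),
    "WI": (0.85, 10),
}
SAFE_OBAMA = 237   # EV assumed locked up outside the swing states
SAFE_ROMNEY = 206  # (237 + 206 + 95 swing EV = 538)

def simulate_once(rng):
    """Flip a weighted coin for each swing state; return Obama's EV total."""
    ev = SAFE_OBAMA
    for p_win, votes in SWING.values():
        if rng.random() < p_win:
            ev += votes
    return ev

rng = random.Random(42)
trials = 10_000
results = [simulate_once(rng) for _ in range(trials)]
wins = sum(1 for ev in results if ev >= 270)
print(f"Obama wins {wins} of {trials} simulated elections ({wins / trials:.1%})")
```

Even a model this crude produces a full distribution of electoral-vote outcomes rather than a single number, which is exactly the point.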
What Silver's 80.9% forecast technically means is that, if the Obama-Romney 2012 election were contested 1000 times, he thinks Obama would win 809 of them.
That's not what it "technically" means, that's just what it means. Full stop.

I see a lot of people who get upset and say things like "Damn it, weatherman, you said there was a 30% chance of rain and it didn't rain!" If the weatherman says there's a 30% chance of rain on ten different days, it should rain on about three of them and stay dry on the other seven. Not that complicated.
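The weatherman complaint is easy to check with a tiny simulation (seed and day count are arbitrary):

```python
import random

rng = random.Random(0)

# Simulate many days on which the forecaster said "30% chance of rain."
days = 10_000
rainy = sum(1 for _ in range(days) if rng.random() < 0.30)

# A well-calibrated forecaster is vindicated, not refuted, by the dry days.
print(f"Rain on {rainy} of {days} such days ({rainy / days:.1%})")
```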
I happen to find Silver's reasoning pretty convincing. But I still don't know quite what to make of his 80.9%. Maybe we'd be better off if he just expressed it as a scatter graph of the results of his thousands of Monte Carlo simulations. [...]

Update: Economist Richard Thaler points out that if you scroll down the page a bit at fivethirtyeight.blogs.nytimes.com, there's a chart showing the distribution of potential outcomes generated by Silver's model. Let's have a show of hands of how many of you have ever actually looked at that.
Me. I'm raising my hand right now. I do that. A plot gives you the shape of a distribution in a way a single scalar value never can, as anyone who has read Tufte should know.

Here's a distribution plot I generated to help me decide which of two kickers I should pick up as a back-up.

One player is red, the other blue. The solid and dashed lines reflect different distributional assumptions that are irrelevant to the present discussion. Just looking at the means of Red and Blue isn't very helpful, as they differ by only a quarter of a point. Knowing that the red player will outscore the blue one in 422 out of 1000 matchups (according to the model, of course) is better. Seeing the shapes of the curves gives me extra information that scalars alone do not.
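For the curious, the head-to-head number comes from something like the following. The means and spreads here are invented stand-ins for my actual inputs, and I'm using plain normal distributions rather than the ones in the plot, so don't expect it to reproduce the 422:

```python
import random

rng = random.Random(1)

def simulate_matchups(n=1000):
    """Sample each kicker's weekly score n times and count how often
    Red outscores Blue. Parameters are hypothetical: Red's mean is a
    quarter point lower, but his spread is larger."""
    red_wins = 0
    for _ in range(n):
        red = rng.gauss(7.75, 3.5)   # (mean, std dev), made up
        blue = rng.gauss(8.00, 2.0)
        if red > blue:
            red_wins += 1
    return red_wins

wins = simulate_matchups()
print(f"Red outscores Blue in {wins} of 1000 simulated matchups")
```

Comparing the two means tells you almost nothing here; the matchup count, and a plot of the two curves, tell you much more.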

And even if you're good at interpreting mu and sigma for a normal distribution, very, very few people are good at interpreting the parameters of other distributions. But it's not hard to interpret those same distributions once you plot out the PDF. So yes, look at graphs.
[This histogram is] one more example of Silver's intellectual honesty; it also doesn't have much chance of affecting our probability beliefs as much as the percentage does.
Asserted without evidence.
Or [Silver] could follow the lead of conservative pundit Timothy P. Carney, who made an election forecast this week that paralleled Silver's in most aspects but did without a percentage. Instead, Carney — after going through an if/then analysis on all the swing states — concluded with a range of potential Electoral College outcomes:
The 292-246 Obama victory is most likely, but I wouldn't be surprised at all by anything up to 331 for Obama, or up to 269 (a tie) for Romney.
I really like Carney, and ranges are fine, but they're no substitute for plots. A range gives you more information than a point estimate (it's more numbers than a single scalar, after all, so it ought to), but it still doesn't capture spread, skew, etc. the way a histogram does.
And if you're a business executive or money manager trying to make investment decisions contingent on the election outcome, there are so many other complications that the difference between 80.9% odds and 70.9% or 60.9% may not be all that significant.

That's the way it goes with decision-making under uncertainty. Sometimes, by summing all that uncertainty up in a single number, it can feel as if you've made it go away. You haven't.
*AHEM* Value At Risk calculations *AHEM*.

Somebody take those last two sentences, travel back in time to 2007, and tattoo them on foreheads all around Wall Street.
Mathematician John Allen Paulos tweeted, regarding the trouble that so many seem to have with election probabilities, that:
Many people's notion of probability is so impoverished that it admits of only two values: 50-50 and 99%, tossup or essentially certain.
He meant this as an insult, I think, but it's actually a good description of how humans naturally think about probability.
It can't be both an insult and an accurate description?

See also: Popehat | Life is Not a Coin Flip
I'm not blaming Silver for anything here [...] but those who wish to communicate risk and uncertainty do need to be aware that most of their audience has a problem with probability.
I'd be surprised if anyone capable of decent quantitative analysis wasn't acutely aware of how innumerate most people are.


  1. "technically means is that, if the Obama-Romney 2012 election were contested 1000 times, he thinks Obama would win 809 of them"

    I've actually become convinced that this frequentist approach is the wrong way to try to explain these probabilities to a layperson. That's because you just know they're going to get hung up on 'but that's dumb because there's only going to be one election, not 1000!'

    If I had to develop a fuller explanation from scratch I think that instead I'd borrow the David Deutsch-style many-worlds explanation of quantum mechanics:

    There are different possible futures because there are multiple universes. We don't quite know which universe we're in so we don't know our future (in particular, we don't know who will win the election). We *do* know some things about our universe though. For example, we know that in our universe, pollsters have found a 3% Obama lead in Ohio. But out of all the possible universes with an Ohio and pollsters finding O+3%, basic stats tells us some ~85% of those Ohios *really do* prefer Obama and are going to majority-vote for him. Thus, once we narrow down (using polls) possible universes we could theoretically be in, we look and notice that in ~85% of their futures, Obama wins today.

    And that's all that that 'probability' means: the share of our future-possible-universes (given data we know, i.e. polls) in which Obama wins.

    Admittedly, I'm not sure this would work better than just talking about N=1000, but sometimes I think it's worth a shot...

  2. I like that approach. Provided you don't scare people away as soon as you say "multiple universes" I think this is a very good way to approach it.

    I think sports analogies can help with people objecting that there's only going to be one election, not a thousand. People know there's only one Super Bowl each year, but they don't have too hard a time imagining that the Giants and Patriots could play each other a thousand times. I've explained that to people and followed up with the observation that you can similarly imagine a thousand elections, and it seemed to work well enough.

  3. I've been looking at doing a Monte Carlo simulation in MATLAB, but after looking at Nate Silver's blog I figured he could do all that analysis much better than I could.

  4. Oh he can definitely do it better than I can. No doubt. But I think if people are going to be writing professionally about things like election models they ought to take some time to make even a very simple model of something, just to get an idea for what's involved.

    Also if you take a look at the simple model over at RWCG you'll see that you don't need very much sophistication in order to reproduce a lot of the same effects that Silver's (presumably) much more nuanced model does. Typically there are diminishing returns to model detail: once you pass a certain threshold of correctness, doing a much, much better job only gets you slightly better results.

  5. "I'd be surprised if anyone capable of decent quantitative analysis wasn't acutely aware of how innumerate most people are."

    Or even crappy quantitative analysis.
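A postscript on comment 1: the ~85% figure there is just a normal-approximation calculation. If the observed lead is O+3 and you assume the polling error on that lead has a standard error of about 3 points (an assumption for illustration, not a number from any pollster), then the share of "possible universes" in which Obama truly leads is Φ(3/3) = Φ(1), or about 84%:

```python
from math import erf, sqrt

def prob_true_lead_positive(observed_lead, std_err):
    """P(true lead > 0) assuming a normally distributed polling error:
    Phi(observed_lead / std_err), computed via the error function."""
    z = observed_lead / std_err
    return 0.5 * (1 + erf(z / sqrt(2)))

p = prob_true_lead_positive(3.0, 3.0)  # O+3 lead, assumed 3-point std error
print(f"{p:.1%}")  # prints 84.1%, i.e. Phi(1)
```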