What Went Wrong

In 1936, the Literary Digest infamously predicted a landslide win for Alf Landon over FDR. In 1948, Gallup even more infamously had Dewey decisively besting Truman. And now, in 2016, yet another reckoning: “Clinton Trumps Trump” has yielded to “Trump Triumphs.”

No doubt it would be easy to lump these failures together. However, it would also be a mistake. Landon’s hopes and Dewey’s aspirations ultimately lost out to fundamental errors in polling methodology. By contrast, Clinton fell prey to something far more befitting the 21st century. The problem wasn’t with the polls, but with how they were aggregated — or more specifically, with the sophisticated models that were built on top of them.

Where the models went wrong was this: they assumed that the fifty state elections were largely independent. Think of the election as a series of fifty coin flips. One way to model that series is to flip fifty distinct coins exactly once, with each coin biased in a way that’s both unique and random. The allure of this model is its simplicity. If each state’s bias is independent of the others, then over fifty states the biases will largely cancel out, and the actual results will fall in a narrow band around the national polling average. (As best I can tell, this is why Huff Post’s model gave Hillary a 98% chance of winning.)

However, that model is almost certainly not true. In reality, many state elections are likely to be biased in a similar way. In that case, the election overall is more like fifty flips of the same biased coin. If we model the election that way, then the actual election results will converge on the polling average plus the shared bias. The polls for PA, NC, FL, MI, and WI, among others, suggest that this is likely what happened: the polling error in each of these states moved in the same direction, due to mistaken likely-voter estimates for white citizens without a college degree. In that regard, it’s not a coincidence that Nate Silver, the aggregator who most strongly accounted for the possibility of correlated state-level errors, also had the greatest uncertainty in his forecasts.
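The difference between the two models is easy to see in a quick simulation. Here is a minimal sketch, using hypothetical numbers (a 4-point standard deviation on each state’s polling error): when the errors are independent they largely cancel in the national average, but when they share a common component, they don’t.

```python
import random

def national_error_sd(n_states=50, n_sims=10_000, correlated=False):
    """Std. dev. of the average state-level polling error across simulations.

    Independent model: each state draws its own error (fifty distinct coins).
    Correlated model: all states share one error (one coin flipped fifty times).
    """
    averages = []
    for _ in range(n_sims):
        shared = random.gauss(0, 0.04)  # hypothetical 4-point error sd
        errors = [shared if correlated else random.gauss(0, 0.04)
                  for _ in range(n_states)]
        averages.append(sum(errors) / n_states)
    mean = sum(averages) / n_sims
    return (sum((x - mean) ** 2 for x in averages) / n_sims) ** 0.5

print(f"independent errors: {national_error_sd():.4f}")                 # ~0.006: biases cancel
print(f"correlated errors:  {national_error_sd(correlated=True):.4f}")  # ~0.040: they don't
```

Under the independent model, the spread of the national average shrinks by a factor of roughly the square root of fifty; under the correlated model, it doesn’t shrink at all. That’s the entire gap between a 98% forecast and a cautious one.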

So where to go from here?

To be sure, the polls themselves could be improved. [1] But again, they were not the problem. The median polling error was only around 4%, and the polls still got the popular vote right. It’s the aggregation that needs work. And to be clear, it’s not that the aggregators should have identified the specific bias among white, non-college voters — it’s that they should have accounted for the possibility that that kind of bias might exist. In the next election, aggregators either need to pay more attention to how they model correlated errors, or publish clear disclaimers about potential sources of uncertainty that are not represented in their models.

Finally, one last point. This is the same mistake that contributed to the housing bubble a decade ago, just prior to Obama entering office. As with the aggregators today, the banks back then assumed that state and regional real estate markets were all independent — many separate coins, so to speak, rather than one coin flipped many times. In retrospect, treating sub-national errors as independent was clearly wrong then, and it is clearly wrong now. [2]

The first time we made this mistake it was devastating for millions. But at least then we could argue, with something of a straight face, that we didn’t know better.

We have no such luxury here.


[1] In particular, there need to be greater incentives for pollsters to release their raw data. When the Times commissioned a survey of Florida voters last September and asked five separate teams to analyze it, it’s telling that the one team who got Florida “right” also used a unique likely voter screen. Ideally, this kind of open analysis should be the norm, not the exception.

[2] I’d also go a step further than this, and suggest that in this case the bias probably wasn’t just correlated across states in the U.S. If you had tried to model U.S. and U.K. politics together one year ago, you likely would have found that the national-level errors were correlated too — i.e., that the polls had a similar downward bias in their likely-voter screens for white non-college voters.

Where Do We Go From Here?

There was a great panel at Brookings this week on counter-terrorism and the future of global jihadism, hosted by Will McCants and featuring Ambassador Kaidanow, Dan Byman, and Bruce Riedel. (If you missed it, the audio is here.)

Now that I have a minute, I thought I’d offer up a few quick thoughts in response:

  1. We’re still in need of new ways to evaluate foreign policy success. At one point there was a conversation about whether our foreign policy has made us safer since September 2001. Intuitively this feels like a straightforward question, akin to asking, say, whether NYC’s crime policy under Bloomberg made the city safer. The catch is that the two questions can’t be answered the same way. When we ask if a policy has made us safer, what we’re really asking is whether the policy has reduced the probability that a given type of violent event will occur. For crime, we can tell fairly quickly if the probability has changed, because crime otherwise occurs frequently. Alas, that is not true for mass-casualty terrorist events, which occur rarely and follow a power-law distribution. Even if our policies worked perfectly and dropped the probability of a mass terror attack to zero, we wouldn’t be able to show that the policies were working within any of our lifetimes.

    I bring all this up not to say that the more/less safe question isn’t worth asking. It’s to say that answers like “we haven’t been attacked since 2001” or “but the shoebomber almost got us!” aren’t really the kinds of evidence you can use to answer it. It’s far more productive to talk about what we believe the processes underlying terrorism are, how our policies are designed to disrupt those processes, and what the corollary data say about those specific policy interventions.

  2. Does anyone still think “terrorism” is a useful analytic term? Or put differently: does anyone think terrorism refers to a singular phenomenon? It would be too much to say that Ambassador Kaidanow, who coordinates counter-terrorism policy, feels hamstrung by the term, but she was clearly pushing against it. If both the Islamic State and Al-Qaeda are terrorist groups, then the word doesn’t tell us much, since each group produces distinct forms of violence that require distinct policy responses. It was encouraging to hear her implicitly acknowledge this.

  3. Saudi Arabia. This is the one thing I wish the panel had discussed in more detail. As Bruce Riedel touched on briefly, the Saudis are facing a kind of perfect storm: the US is re-balancing away from them, their position in the global energy market is declining, and they now face an intensifying domestic Salafi-jihadist insurgency. The country appears set for a long power struggle between forward-looking members of the Saudi royal family and hardline factions of the military and regime. The tail risks of that infighting are something we should be discussing more openly.

  4. There is a truly massive gap between the political science and policy worlds. Coming from an academic context, it’s hard to overstate how striking this is. I have a lot more thoughts on this, but for now just want to flag it briefly.

  5. Sadly, unlike his twitter avatar, @will_mccants doesn’t actually pop a Victorian collar. Bummer!
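The lifetimes claim in point 1 can be made concrete with a little arithmetic. Suppose, purely hypothetically, that mass-casualty attacks arrive as a Poisson process at a baseline rate of one per 25 years. Here’s how long a run of zero attacks would have to last before it becomes statistically distinguishable from that baseline at the conventional 5% level:

```python
import math

baseline_rate = 1 / 25  # hypothetical: one mass-casualty attack per 25 years

# Under a Poisson process, P(zero attacks in t years) = exp(-rate * t).
# Find the smallest t where "no attacks" would be surprising at the 5% level.
years = 1
while math.exp(-baseline_rate * years) > 0.05:
    years += 1
print(years)  # 75
```

Seventy-five consecutive attack-free years before an observer could reject the baseline. And that’s the best case, where the policy drops the attack probability all the way to zero.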