Nate Silver: The model exactly predicted the most likely election map

natesilver.net

Key excerpt (but it's worth reading the full thing):

But the real value-add of the model is not just in calculating who’s ahead in the polling average. Rather, it’s in understanding the uncertainties in the data: how accurate polls are in practice, and how these errors are correlated between the states. The final margins on Tuesday were actually quite close to the polling averages in the swing states, though less so in blue states, as I’ll discuss in a moment. But this was more or less a textbook illustration of the normal-sized polling error that we frequently wrote about [paid only; basically says that the polling errors could be correlated between states]. When polls miss low on Trump in one key state, they probably also will in most or all of the others.

In fact, because polling errors are highly correlated between states, and because Trump was ahead in 5 of the 7 swing states anyway, a Trump sweep of the swing states was actually our most common scenario, occurring in 20 percent of simulations. Following the same logic, the second most common outcome, happening 14 percent of the time, was a Harris swing state sweep.
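To make the correlation argument concrete, here is a minimal Monte Carlo sketch. It is not Silver's model: the polling margins, the error sizes, and the names `POLL_MARGINS`, `simulate`, and `sweep_shares` are all made up for illustration. Each simulated election draws one polling error shared by every state plus a smaller state-specific error, then records which candidate carries each of the seven swing states.

```python
import math
import random
from collections import Counter

# Hypothetical polling margins (Trump minus Harris, in points).
# Illustrative numbers only, chosen so Trump leads in 5 of 7 states.
POLL_MARGINS = {
    "AZ": 2.5, "GA": 1.5, "NC": 1.2, "NV": 0.5, "PA": 0.3,
    "WI": -0.8, "MI": -1.0,
}


def simulate(n_sims, shared_sd, state_sd, seed=0):
    """Count how often each swing-state map occurs.

    shared_sd: std. dev. of the error common to all states (the correlation).
    state_sd:  std. dev. of each state's own independent error.
    """
    rng = random.Random(seed)
    outcomes = Counter()
    for _ in range(n_sims):
        shared = rng.gauss(0.0, shared_sd)  # one draw applied to every state
        winners = tuple(
            "T" if margin + shared + rng.gauss(0.0, state_sd) > 0 else "H"
            for margin in POLL_MARGINS.values()
        )
        outcomes[winners] += 1
    return outcomes


def sweep_shares(outcomes):
    """Return (share of Trump sweeps, share of Harris sweeps)."""
    total = sum(outcomes.values())
    n = len(POLL_MARGINS)
    return outcomes[("T",) * n] / total, outcomes[("H",) * n] / total


if __name__ == "__main__":
    # Same total error per state in both runs; only the correlation differs.
    total_sd = math.sqrt(3.5 ** 2 + 1.5 ** 2)
    correlated = simulate(20000, shared_sd=3.5, state_sd=1.5, seed=1)
    independent = simulate(20000, shared_sd=0.0, state_sd=total_sd, seed=1)
    print("correlated  (Trump sweep, Harris sweep):", sweep_shares(correlated))
    print("independent (Trump sweep, Harris sweep):", sweep_shares(independent))
```

With a large shared error, the two sweeps come out far more common than they do when each state misses independently, even though each state's total error is the same size in both runs. The exact percentages depend entirely on the made-up inputs, so don't read them as a reproduction of the 20 and 14 percent figures.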

[Interactive table]

Relatedly, the final Electoral College tally will be 312 electoral votes for Trump and 226 for Harris. And Trump @ 312 was by far the most common outcome in our simulations, occurring 6 percent of the time. In fact, Trump 312/Harris 226 is the huge spike you see in our electoral vote distribution chart:

[Interactive graph]

The difference between 20 percent (the share of times Trump won all 7 swing states) and 6 percent (his getting exactly 312 electoral votes) is because sometimes, Trump winning all the swing states was part of a complete landslide where he penetrated further into blue territory. Conditional on winning all 7 swing states, for instance, Trump had a 22 percent chance of also winning New Mexico, a 21 percent chance at Minnesota, 19 percent in New Hampshire, 16 percent in Maine, 11 percent in Nebraska’s 2nd Congressional District, and 10 percent in Virginia. Trump won more than 312 electoral votes in 16 percent of our simulations.
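As a rough sanity check on the gap between 20 percent and 6 percent, the conditional probabilities quoted above can be multiplied out. This is back-of-envelope arithmetic, not how the model works: it assumes the six extra states flip independently once the swing states are swept (the excerpt itself says errors are correlated), and it uses only the states named "for instance", not a full list. The names `COND_UPSETS` and `no_upset_given_sweep` are made up for this sketch.

```python
from math import prod

# Figures quoted in the excerpt.
P_SWEEP = 0.20  # share of simulations where Trump wins all 7 swing states
COND_UPSETS = {  # chance of ALSO winning each state, given a sweep
    "New Mexico": 0.22,
    "Minnesota": 0.21,
    "New Hampshire": 0.19,
    "Maine": 0.16,
    "NE-2": 0.11,
    "Virginia": 0.10,
}


def no_upset_given_sweep(cond_probs):
    """P(none of the extra states flip | sweep), ASSUMING independence."""
    return prod(1.0 - p for p in cond_probs)


p_clean = no_upset_given_sweep(COND_UPSETS.values())
p_exact_map = P_SWEEP * p_clean

if __name__ == "__main__":
    print(f"P(no further upsets | sweep) ~ {p_clean:.3f}")
    print(f"P(sweep and nothing else)    ~ {p_exact_map:.3f}")
```

Under those assumptions the chance of a sweep with no further upsets comes out around 0.34, which puts the sweep-and-nothing-else scenario near 6.7 percent, in the same ballpark as the 6 percent quoted for exactly 312 electoral votes. Correlated errors and the states left off the list would move that number in opposite directions, so treat it as a plausibility check only.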

But on Tuesday, there weren’t any upsets in the other states. So not only did Trump win with exactly 312 electoral votes, he also won with the exact map that occurred most often in our simulations, counting all 50 states, the District of Columbia and the congressional districts in Nebraska and Maine.

I don't know of an intuitive test for whether a forecast of a non-repeating event was well-reasoned (see, also, the lively debate over the performance of prediction markets), but this is Silver's initial defense of his 50-50 forecast. I'm unconvinced - if the modal outcome of the model was the actual result of the election, does that vindicate its internal correlations, indict its lack of confidence in that outcome, both, or neither? But I don't think it's contradictory to hold both that the modal outcome coming true vindicates the model's internal correlations and that its certainty was limited by the quality of the available data, so this hasn't lowered my opinion of Silver, either.

"My model produces unhelpful outputs because it has bad inputs" is still only an excuse at the end of the day. Nate is a pretty influential guy, famous, respected by many. Why doesn't he have six figures to spend on his own poll and make his model better? Do none of his rich friends trust him enough to invest in him?

"My model produces unhelpful outputs because it has bad inputs" is still only an excuse at the end of the day.

It's not a matter of the model having bad inputs. The model had all the publicly available inputs.

Why doesn't he have six figures to spend on his own poll and make his model better? Do none of his rich friends trust him enough to invest in him?

Anyone that would pay for that would want to keep the results private in order to better leverage them in some fashion. Otherwise why are they paying for better polling just to give it away to everyone? What return do they have to reap out of investing in a better prediction? The intrinsic value of better public polling?

I'd also comment that even Polymarket was 50/50 for a while and then 60/40 in the days before the election.

Otherwise why are they paying for better polling just to give it away to everyone? What return do they have to reap out of investing in a better prediction? The intrinsic value of better public polling?

While I basically agree that Nate Silver did as good a job as possible, this is a real problem. Garbage in, garbage out. He built a model that relied on free public information, and the quality of that information has degraded over time. I think it's entirely possible that his "business model" (or whatever you want to call it) is no longer viable. Once upon a time there wasn't an Internet, and then there wasn't enough data on the Internet, but eventually we entered the age of Big Data. Now maybe it's ending.

One of the reasons we used to have good polls is that we had well-funded mainstream media sources that were interested in accurately reporting the state of reality. But funding went down, the number of sources doing ground-level reporting shrank, and the ones that remain have become more cautious about taking risks. Most importantly, many of them have stopped caring about reporting reality and are more interested in shaping it toward their preferred political pole or, almost worse, they just say whatever the current party line is.