
Nate Silver: The model exactly predicted the most likely election map

natesilver.net

Key excerpt (but it's worth reading the full thing):

But the real value-add of the model is not just in calculating who’s ahead in the polling average. Rather, it’s in understanding the uncertainties in the data: how accurate polls are in practice, and how these errors are correlated between the states. The final margins on Tuesday were actually quite close to the polling averages in the swing states, though less so in blue states, as I’ll discuss in a moment. But this was more or less a textbook illustration of the normal-sized polling error that we frequently wrote about [paid only; basically says that the polling errors could be correlated between states]. When polls miss low on Trump in one key state, they probably also will in most or all of the others.

In fact, because polling errors are highly correlated between states — and because Trump was ahead in 5 of the 7 swing states anyway — a Trump sweep of the swing states was actually our most common scenario, occurring in 20 percent of simulations. Following the same logic, the second most common outcome, happening 14 percent of the time, was a Harris swing state sweep.6

[Interactive table]

Relatedly, the final Electoral College tally will be 312 electoral votes for Trump and 226 for Harris. And Trump @ 312 was by far the most common outcome in our simulations, occurring 6 percent of the time. In fact, Trump 312/Harris 226 is the huge spike you see in our electoral vote distribution chart:

[Interactive graph]

The difference between 20 percent (the share of times Trump won all 7 swing states) and 6 percent (his getting exactly 312 electoral votes) is because sometimes, Trump winning all the swing states was part of a complete landslide where he penetrated further into blue territory. Conditional on winning all 7 swing states, for instance, Trump had a 22 percent chance of also winning New Mexico, a 21 percent chance at Minnesota, 19 percent in New Hampshire, 16 percent in Maine, 11 percent in Nebraska’s 2nd Congressional District, and 10 percent in Virginia. Trump won more than 312 electoral votes in 16 percent of our simulations.
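The mechanism described here, a shared polling miss that makes sweeps the modal outcomes, is easy to sketch. The leads and error magnitudes below are illustrative placeholders, not Silver's actual inputs; the point is only that correlated errors make a sweep far more likely than independent errors would:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical swing-state polling leads in percentage points
# (positive = Trump ahead). Made-up numbers, not Silver's model inputs.
leads = np.array([1.0, 0.8, 0.5, 0.2, -0.2, 0.3, 0.6])

n_sims = 100_000
# Correlated case: one shared national miss plus independent state noise.
shared = rng.normal(0, 2.5, size=(n_sims, 1))
local = rng.normal(0, 1.5, size=(n_sims, 7))
sweep_corr = ((leads + shared + local) > 0).all(axis=1).mean()

# Independent case with the same total per-state error, for contrast.
total_sd = np.sqrt(2.5**2 + 1.5**2)
indep = rng.normal(0, total_sd, size=(n_sims, 7))
sweep_indep = ((leads + indep) > 0).all(axis=1).mean()

print(f"sweep probability, correlated errors:  {sweep_corr:.1%}")
print(f"sweep probability, independent errors: {sweep_indep:.1%}")
```

With the shared term, a sweep lands in the same ballpark as Silver's 20 percent; with independent errors of the same total size it collapses to a couple of percent, which is why the correlation structure, not the topline average, is doing the work.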

But on Tuesday, there weren’t any upsets in the other states. So not only did Trump win with exactly 312 electoral votes, he also won with the exact map that occurred most often in our simulations, counting all 50 states, the District of Columbia and the congressional districts in Nebraska and Maine.

I don't know of an intuitive test for whether a forecast of a non-repeating event was well-reasoned (see, also, the lively debate over the performance of prediction markets), but this is Silver's initial defense of his 50-50 forecast. I'm unconvinced: if the modal outcome of the model was the actual result of the election, does that vindicate its internal correlations, indict its confidence in its output, both, neither? But I don't think it's irreconcilable that the model's modal outcome being real vindicates its internal correlations AND that its certainty was limited by the quality of the available data, so this hasn't lowered my opinion of Silver, either.


This makes me think of someone stuck on a very sticky wicket, trying to justify an argument that was fundamentally wrong. Of course there are facets of any sophisticated but wrong argument that are right. You can highlight the correct facets and minimize the wrong facets. You can pre-prepare reasons for why you might be wrong to conserve credibility.

Nate has the rhetorical skills to pull it off. But it still feels very slimy. The 90 IQ twitter pleb mocking him with '60,000 simulations and all you conclude is that it's a coin flip?' may not be that numerically literate. But he has hit on a certain kind of wisdom. The election wasn't a 50/50 or a dice-roll. It was one way or another. With superior knowledge you could've called it in Trump's favour. Maybe only Bezos and various Lords of the Algorithms, French Gamblers and Masters of Unseen Powers knew or suspected - but there was knowledge to be had.

I prefer prediction models that make money before the outcome is decided, not ones that have to be justified retroactively. Nate wasn't heralding before the election that this 6% was the modal outcome; it wasn't really useful information.

The election wasn't a 50/50 or a dice-roll. It was one way or another.

Before it happened, it wasn't. Even if you had universal Legilimency and knew the political views of every voter as well as the voter knew themselves, the result could differ from the Legilimency-poll because of differential turnout (which can be affected by unpollable things like the weather on polling day) or late swing (some voters actually change their minds in the 3-4 days between the field work being done for the eve-of-poll polls and the actual election).

If the exit polls are correct, the Brexit referendum was decided by people who made their mind up day-of.

It was one way or another. With superior knowledge you could've called it in Trump's favour. Maybe only Bezos and various Lords of the Algorithms, French Gamblers and Masters of Unseen Powers knew or suspected - but there was knowledge to be had.

That knowledge wasn't available to the model. A French gambler paying a hundred grand for a private poll specifically does so in order to possess information that others do not.

"My model produces unhelpful outputs because it has bad inputs" is still only an excuse at the end of the day. Nate is a pretty influential guy, famous, respected by many. Why doesn't he have six figures to spend on his own poll and make his model better? Do none of his rich friends trust him enough to invest in him?

"My model produces unhelpful outputs because it has bad inputs" is still only an excuse at the end of the day.

It's not a matter of the model having bad inputs. The model had all the publicly available inputs.

Why doesn't he have six figures to spend on his own poll and make his model better? Do none of his rich friends trust him enough to invest in him?

Anyone that would pay for that would want to keep the results private in order to better leverage them in some fashion. Otherwise why are they paying for better polling just to give it away to everyone? What return do they have to reap out of investing in a better prediction? The intrinsic value of better public polling?

I'd also comment that even polymarket was 50/50 for a while and then 60/40 in the days before the election.

Otherwise why are they paying for better polling just to give it away to everyone? What return do they have to reap out of investing in a better prediction? The intrinsic value of better public polling?

While I basically agree that Nate Silver did as good a job as possible, this is a real problem. Garbage in, garbage out. He built a model that relied on free public information, and the quality of that information has degraded over time. I think it's entirely possible that his "business model" (or whatever you want to call it) is no longer viable. Once upon a time there wasn't an Internet, and then there wasn't enough data on the Internet, but eventually we entered the age of Big Data. Now maybe it's ending.

One of the reasons we used to have good polls is that we had well-funded mainstream media sources that were interested in accurately reporting the state of reality. But funding went down, the number of sources doing ground-level reporting shrank, they've become more cautious about taking risks, and most importantly, many of them have stopped caring about reporting reality, and are more interested in shaping reality toward their preferred political pole, or almost worse, they just say whatever the current party line is.

Do you have a substantive disagreement with his argument?

If Nate put his money where his mouth was, he'd have lost $100,000. He talks the talk (after it's decided) but doesn't walk the walk when it actually means anything.

https://x.com/NateSilver538/status/1842211340720504895

Did the other guy send the contract?

He says so and that Nate later refused to sign.

Did the other guy provide proof that he sent the contract?

A Twitter exchange is in fact a form of contract -- so whether the guy sent Nate a piece of paper saying "I will pay Nate Silver 100K if Florida goes less than R +8, otherwise he will pay me", I think the terms of the bet were pretty clear.

I certainly wouldn't require Nate to pay up based on the Twitter exchange, but that would definitely be the Honourable thing to do -- he can probably afford it based on what he's charging on Substack alone, and it would be great degenerate-gambler PR for him to do so.

A Twitter exchange is in fact a form of contract -- so whether the guy sent Nate a piece of paper saying "I will pay Nate Silver 100K if Florida goes less than R +8, otherwise he will pay me", I think the terms of the bet were pretty clear.

...?

If the Twitter exchange is in fact a form of contract, then so is the stipulation of said Twitter exchange for the requisite next step- which includes Nate's condition that the other person send a formal contract via lawyer. If the guy sends a piece of paper saying what you say, it would be failing to meet the conditions of the terms of the Twitter-contract.

If the Twitter exchange is in fact a form of contract, then so is the stipulation of said Twitter exchange for the requisite next step- which includes Nate's condition that the other person send a formal contract via lawyer.

Just so -- that's why I wouldn't fault Nate for not paying up. But the whole point of honour culture is that one feels the need to go above and beyond what's legally required, even when it's to one's own detriment. It's not like the bet was unclear or something -- the sporting thing to do would be to chuckle and write a cheque.


This is tangential to my main point. Nate's beliefs about the world, if expressed, would've cost him a lot of money. There are probably large numbers of people who trusted Nate's modelling and lost money thinking 'oh well if Nate says it's 50/50 then I can profit from Polymarket being 70/30'.

I think Nate is trying to claw back lost prestige. It reeks of cope to say 'my model exactly predicted the most likely election map' when his model gave it a 6% chance of happening. He struggles to even say what he wanted to happen on election day from the perspective of 'what makes my model look good'. If you're not producing a useful, falsifiable prediction then what is the point?

The important thing is getting it right. I want models that give alpha. If I'm going to pay for a substack, I'd rather pay for someone who got it right, like Dominic Cummings, who said confidently that Kamala would lose, based on his special polling methods. He actually predicted a Trump victory over Kamala 311-227 in 2023. He foresaw that Biden would likely become obviously senile, and that the Dems needed a better candidate.

https://x.com/Dominic2306/status/1854275006064476410

https://x.com/Dominic2306/status/1854505847440715938

Let's say he had bet $100,000 at 50-50 odds that he wouldn't roll a six on a die. Then he rolls a six. Does that prove something about his beliefs? It's only profitable in expectation. There is no guarantee of making money.

To take the election example, 50-50 means losing half the time. It's only profitable because, when you do win, you win more than you would have otherwise lost.
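The arithmetic behind both examples is worth writing out. The die numbers follow directly from the scenario above; the market figure of 40 cents is a hypothetical price chosen to match the rough 60/40 split mentioned elsewhere in the thread:

```python
# Die example: an even-money $100k bet that a fair die does NOT come up six.
stake = 100_000
p_win = 5 / 6
ev = p_win * stake - (1 - p_win) * stake
print(round(ev, 2))  # 66666.67: strongly +EV, yet it loses outright 1/6 of the time

# Election version: if you believe the race is 50-50 but a market prices
# Harris shares at a hypothetical 40 cents per $1 payout, each share is
# +EV, and you still lose that trade half the time.
ev_per_share = 0.5 * 1.00 - 0.40
print(f"{ev_per_share:.2f}")  # 0.10
```

This is the sense in which a single losing bet proves almost nothing about the quality of the underlying belief.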

If you're not producing a useful, falsifiable prediction then what is the point?

That is just not possible to get from a single sample. You need to look at his track record over many elections.

This is tangential to my main point.

If that is so, I accept this correction in good faith, and I do believe this elaboration of the main argument is substantially stronger. I am not attempting to change your opinion on Nate Silver's accuracy.

I am still curious whether there was ever evidence that Silver's bet was accepted, both for its own sake in addressing the question and for updating priors, but an argument of 'he would have lost money if he made the bet' is a substantially different argument than 'he refused to respect his own challenge,' and if you did not mean the latter interpretation I am grateful for the clarification.

There are probably large numbers of people who trusted Nate's modelling and lost money thinking 'oh well if Nate says it's 50/50 then I can profit from Polymarket being 70/30'.

how does it work?

To be fair, in the absence of Nate denying it, I don't think he necessarily needs to provide proof.

That would not be fair. In the absence of Nate confirming that he refused to sign a contract, a claim of having sent the contract is just a claim absent further evidence.

My curiosity / eyebrow is raised because Ranger is raising this bet as a character failure on the part of Nate Silver, but the proffered evidence is of the conditional offer of a bet, not that the bet was accepted as offered but that Silver refused to sign it.

This leads to a couple of issues for which more information than has been provided is needed.

-Did the other person actually accept the bet, or are they just claiming so with post-election hindsight? (i.e. is he talking the talk after the election is decided?)

-Did the person try to modify the terms of the bet offered that would render the offer void? (i.e. did he refuse to walk the walk when it mattered?)

-Did the person fail to meet the conditions of the offer of bet? (i.e. did they not have their lawyer do it, but tried to make their own contract- thus invoking the payment risk issue raised?)

I've no particular strong feeling on Nate Silver one way or another, but if someone wants to make a character failure accusation with linked evidence I'd generally prefer the links to be evidence of a character failure.

That would not be fair. In the absence of Nate confirming that he refused to sign a contract, a claim of having sent the contract is just a claim absent further evidence.

I disagree. All Nate has to do is say "no you didn't, you fucking liar", and if Keith can't provide evidence of sending him the contract, he's the one that's going to suffer reputational damage. On the other hand, if Nate says that, but Keith promptly provides evidence, this will look even worse for him. Since Nate knows for a fact whether or not he received the contract, his decision on how to react to the claim tells us something about the truth value of the claim that he was sent the contract.

There are also scenarios that would explain a lack of reaction. Maybe after the spat Nate blocked Keith, and has no knowledge that he's now going around claiming that he sent the contract. So while the lack of reaction doesn't outright prove the contract was sent, I maintain that the potential reputational damage that can result from the claim is a weak form of evidence in itself, and thus it is the demand to provide hard evidence that's unfair.


Nate wasn't heralding before the election that this 6% was the modal outcome; it wasn't really useful information.

I don't have links or citations, and most of his commentary was paywalled so I only saw the public-facing free posts, but as far as I remember, he very much made the point that the '50-50' odds didn't mean the final result would be a 50-50 split. His article on polls 'herding' very much pointed out that polls had a standard margin of error, and thanks to herding it was impossible to say if they would fall one way (polls systematically undercount Kamala's support, and she sweeps the 7 swing states) or the other (polls undercount Trump and he sweeps the 7 swing states). However, by far the most likely outcome was one or the other. I don't think he specifically called out the modal outcome (Trump wins 312 EC votes) as such, but it was clear to me going in that the final result of the night would be a) very close in the popular vote and b) probably a blowout in the EC.

I was liveblogging the Election Night with my high school 'Government & Economics' class, and I sent them Silver's article on herding for the class to read beforehand, with this commentary:

There's a statistical concept called 'herding' that seems to be affecting many (most?) swing-state polls. Pollsters don't want to be wrong, or at least not more wrong than the rest of the field, so if their poll shows results that are different than the average, they stick 'em in a filing cabinet somewhere and don't publish them. The problem is, we don't know what those unpublished polls say, so the state of the race may be considerably different than the current forecasts -- either more in Kamala Harris' favor, or Trump's. It's very unlikely for this election to be a blowout in the popular vote (though a small swing in popular vote could result in a major electoral college win for one candidate) but be warned that the Presidential results may be quite a bit different than your current expectations.
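A rough version of the herding check works by comparing the spread of published results to the spread that sampling noise alone would produce. The sample size and reported margins below are hypothetical, chosen only to show what a suspiciously tight field looks like:

```python
import numpy as np

# With n respondents and true two-way share p, sampling noise alone should
# spread reported margins by about 2*sqrt(p*(1-p)/n).
n = 800            # respondents per poll (assumed)
p = 0.5            # true two-way vote share in a tied race
sd_margin = 2 * np.sqrt(p * (1 - p) / n) * 100   # in percentage points

# Hypothetical published margins (points), suspiciously clustered.
reported = np.array([0.5, 1.0, 0.0, 0.5, 1.0, 0.5, 0.0])

print(f"margin SD from sampling alone:  {sd_margin:.1f} pts")       # ~3.5
print(f"observed SD of published polls: {reported.std(ddof=1):.1f} pts")
```

When the observed spread is a small fraction of the sampling-noise floor, the natural inference is exactly the filing-cabinet story: outlier results exist but aren't being published.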

I followed Silver's model closely, as well as Polymarket, and I was not surprised by the Election Night results. I understood that there was a lot of uncertainty, and that 'garbage in, garbage out' in terms of polls herding (and in terms of that Selzer poll), and I found myself highly impressed at Silver's analysis of the race.

And here was my commentary at the end of Election Night:

the polls were absolutely right about how close this election was. Trump's results tonight are very much within the 'expected error' for most polls -- he isn't winning by 5% or 10% nationwide. The polls indicated that Kamala was favored to win the popular vote by about 1%, but with 'error bars' of +/- 3% or so. Trump is currently expected to win the national popular vote by about 1%, which is a difference of 2%. That small amount is enough to push a bunch of swing states into his win column in the Electoral Vote count, but I want to emphasize that even though he's favored to win, and he almost certainly will win a huge majority in the Electoral College, this was still a nail-biter of an election.

this was still a nail-biter of an election.

It wasn't as close as 2020 in terms of the number of votes, but it was still a margin of ~300k in the key swing states between a Trump win and a Harris victory.

How can news sites call it so early if it's such a small margin at the end?

The number of votes you need to form a representative sample is smaller than a lot of people think. So once you have the first few thousand votes counted in any given county, you have a very, very good sense of how the rest of the vote in that county will be distributed, with a relatively small margin of error. Based on that, after a certain number of counties start reporting results, you can often quickly reach a point in some of the more lopsided states where, regardless of the distribution of votes in future counties, the vote is already effectively decided. And in closer states like the swing states, once all the areas are reporting and have a large enough sample of results, even what seem like relatively small margins (like 51% to 48%) can give you the confidence to call a final result on the more-or-less ironclad assumption that the rest of the votes to be counted will have a very similar distribution.

It's really only on the very very close races that it might take more than a day, or multiple days, to arrive at a result.
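The sampling logic above can be made concrete with a back-of-the-envelope interval. This is only an illustration of the statistics, not how decision desks actually model calls; the hard part in practice is the assumption flagged in the comment, namely that the counted votes resemble the uncounted ones:

```python
import math

def share_interval(p: float, k: int, z: float = 4.0) -> tuple[float, float]:
    """Wide (z=4) interval for the leader's two-way share, given k counted
    votes treated as a representative sample of the remaining votes."""
    se = math.sqrt(p * (1 - p) / k)
    return p - z * se, p + z * se

# A 51% observed share after 100,000 representative votes:
low, high = share_interval(0.51, 100_000)
print(f"{low:.3f} to {high:.3f}")  # the whole interval sits above 0.500
```

Even with a very conservative z, 100,000 counted votes put the entire interval above 50%, which is why a seemingly narrow 51-48 race can be called well before the count finishes; genuinely razor-thin races are the ones where this interval straddles 0.500 for days.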

Pollsters don't want to be wrong, or at least not more wrong than the rest of the field, so if their poll shows results that are different than the average, they stick 'em in a filing cabinet somewhere and don't publish them.

We should require pre-registration of polls. Have some of the major news networks say they won't publish them unless they are registered, in advance, with a clear notion of when they will take place and when the results will be reported.