## Where did the change happen? or, The “P” in Polly stands for Probability

**shainen** | **October 27, 2020** | **Market Research, Methodology, Politics** | **4 Comments**

We’ve been getting a few questions on how to understand Polly’s prediction for the number of electoral college votes (EVs) each candidate will get. Specifically, on Monday Trump gained 9 EVs; which states did that change come from?

A simple assumption would be that a state worth 9 EVs flipped to Trump, or some combination flipped back and forth that adds up to 9. Not quite: Polly is an AI, and that means she’s all about *probabilities*.

Polly’s pretty smart, but she’s also smart enough to know that she can’t guarantee the exact vote margin in any particular state: maybe Trump’s get-out-the-vote apparatus is a bit better, and that gets him a few tenths of a percent more votes. Or maybe it rains on the day of the election and, since Trump’s voters tend to be older and find it more difficult to get to the polling station in bad weather, this gives Biden a few more votes. To account for all these little uncertainties, Polly assigns probabilities to these possible outcomes.

The polling numbers given for each state on Polly’s election map are **the most likely** results on election day, but it’s possible that the vote is a bit more or a bit less than what is reported. How much more, or less, is given by the ± symbol followed by a number: for example, ±3% means the margin of error is 3 percentage points either way. The next part digs a bit deeper, so it might get a bit heavy for some of you.

These polling probabilities can be directly converted into Polly’s confidence that Trump, or Biden, wins the state. Let’s focus on Texas as an example. On Monday Polly thought the most likely polling result was 50.2% for Biden and 49.8% for Trump, with ±3%. This translates to a 52.7% chance that Biden takes the state, and a 47.3% chance that Trump wins the state. Note that if either Biden’s polling goes up, or the error gets smaller than ±3%, Biden’s probability of winning the state increases.
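To make that conversion concrete, here is a minimal sketch. It assumes Biden’s share is normally distributed around the most likely value, with the ± figure treated as one standard deviation. That is an assumption made for illustration, not a description of Polly’s actual model, although it does land on the Texas figures quoted above. The function name `win_probability` is hypothetical.

```python
from statistics import NormalDist


def win_probability(share_pct: float, error_pct: float) -> float:
    """Chance that a candidate's two-way share ends up above 50%.

    Assumption (not taken from Polly): the final share is normally
    distributed around share_pct, with error_pct as one standard deviation.
    """
    return 1.0 - NormalDist(mu=share_pct, sigma=error_pct).cdf(50.0)


biden_texas = win_probability(50.2, 3.0)
trump_texas = 1.0 - biden_texas
print(f"Biden: {biden_texas:.1%}, Trump: {trump_texas:.1%}")
# prints "Biden: 52.7%, Trump: 47.3%"
```

Calling `win_probability(51.0, 3.0)` or `win_probability(50.2, 2.0)` gives a higher number than the Texas result, which is the same effect described above: a better poll or a smaller error both raise the chance of winning.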

Texas is worth 38 EVs, so when the election counting is finally over (maybe on election day, maybe much later!), either Trump or Biden will be the one with all those 38 votes. However, at this moment in time, Polly can’t be 100% certain who that will be, so instead of just guessing (no, AIs can’t really guess), she looks at *both* possibilities. For each of these possible results in Texas, she also looks at the possibilities in all the other states: she considers that if Trump wins Texas, he is also more likely to win Arizona (whatever caused his vote to be higher than expected in Texas may well have the same effect in Arizona).

Statistically speaking, the *probability distribution* of the vote over the whole USA may have correlations between states, and Polly takes such correlations into account in her calculations. When Polly predicts the total number of electoral votes, she integrates this correlated probability distribution over all possible scenarios. Put simply, she takes the EV total in each possible scenario, multiplies it by how likely she thinks that scenario is, and adds up the results to get an average. This gives a more robust prediction than just picking a winner for each state, because it accounts for *how likely* each state is to flip and for the possibility that states will move together in surprising ways, like how the Rust Belt states all went to Trump together in 2016, surprising some who used less sophisticated statistical analysis.
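A toy version of that weighted average, using just Texas and Arizona, might look like the sketch below. Every probability in it is made up for illustration, and the names `SCENARIOS` and `expected_evs` are hypothetical; Polly’s real calculation covers all states and is not shown here.

```python
# Hypothetical two-state example: the probabilities are invented for illustration.
ELECTORAL_VOTES = {"TX": 38, "AZ": 11}

# Joint probability of each scenario, keyed by the set of states Trump wins.
# "Both" is more likely than independent states would give (0.47 * 0.40 = 0.188),
# which is what a positive correlation between the two states looks like.
SCENARIOS = {
    frozenset({"TX", "AZ"}): 0.35,
    frozenset({"TX"}): 0.12,
    frozenset({"AZ"}): 0.05,
    frozenset(): 0.48,
}


def expected_evs(scenarios, electoral_votes):
    """Probability-weighted average of Trump's EV total across all scenarios."""
    return sum(
        prob * sum(electoral_votes[state] for state in states_won)
        for states_won, prob in scenarios.items()
    )


trump_expected = expected_evs(SCENARIOS, ELECTORAL_VOTES)
print(round(trump_expected, 2))
```

In this made-up example Trump is below 50% in both states, so simply picking a winner per state would give him 0 of the 49 EVs. The weighted average instead credits him with about 22, which reflects how close both races are.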

But coming back to our initial question, where do the 9 EVs that Trump gained on Monday come from? Fortunately, we can simplify most of the math to answer this question. On Sunday, the probability of Trump winning Texas was 39%; on Monday this went up to a 47% chance of a Trump win. This change in Trump’s chances leads to an increase of about 3 EVs in Trump’s overall average election standing. So a full third of the 9 EV bounce for Trump came from his increasing fortunes in Texas. Looking into Polly’s issue impact scores, the issue contributing the most to Trump’s polling in Texas on Monday was the possibility of a COVID-19 vaccine, so we can attribute Trump’s bump in Texas to positive news stories about possible vaccines.
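The Texas arithmetic can be checked in a few lines. This sketch assumes a state’s contribution to the average is simply its EVs multiplied by the win probability, which is the simplification described above; the helper name `ev_shift` is hypothetical.

```python
TEXAS_EVS = 38


def ev_shift(evs: int, prob_before: float, prob_after: float) -> float:
    """Change in a candidate's average EVs when one state's win probability moves."""
    return evs * (prob_after - prob_before)


shift = ev_shift(TEXAS_EVS, prob_before=0.39, prob_after=0.47)
print(f"{shift:.2f} EVs")  # prints "3.04 EVs"
```

An 8-point rise in a 38-EV state is worth roughly 3 EVs on average, even though Texas itself did not flip.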

Photo by Chris Liverani on Unsplash

Could you please clarify “On Monday, the probability for Trump winning Texas was 39%; on Monday this went up to a 47% chance of a Trump win”? With 2 Mondays, was one earlier than the other, or was one a typo?

On another matter, it can be argued strongly that the elderly are less likely to use Twitter, yet proportionately more likely to vote. Some may also think that Republicans in this age group might be less likely to use Twitter than Democrats. If true, how would Polly account for this?

Cheers Dan

Thanks for following us! On your first point, thanks for catching that, we’re going to fix that right now. The first Monday was supposed to be Sunday.

On your second point, you might be tempted to argue that, but there’s an awful lot of math that stands behind Polly, and it’s not the pretty kind: it’s statistical math. Polly uses extremely large sample sizes (one third of a million people in this case) in order to get an accurate and representative sample of the population she’s measuring. There may not, proportionally, be many elderly people using Twitter, but there are enough to create a statistically and scientifically valid sample. Many pollsters use techniques prone to sampling errors or sample bias, simply because it’s not practical or economical for them to get samples in the hundreds of thousands. Our sample sizes are big enough that our models are more accurate and better represent the actual demographic we’re studying.

Cheers!

Emond

Does Polly only collect what people are saying using state-level samples, or is she also using representative samples for each county across the country? Say, Pasco County in Florida, which is an important location in the state?

Polly needs to review YouTube live events to pick up the electorate’s momentum. Based on those live event chat rooms, I’ve observed Biden’s getting killed.

Based on recent political events, Twitter is obviously Biden-biased, so Polly’s data is probably skewed. I guess we’ll see soon.