Days before the 2016 presidential election that vaulted Donald Trump to the White House, The New York Times announced that Hillary Clinton had a 91% chance of winning. According to the newspaper’s own polling data, her chance of losing was about the same as an NFL kicker missing a 31-yard field goal.
The Times wasn’t alone in its prediction. Virtually every model, from Nate Silver’s FiveThirtyEight to the Princeton Election Consortium, reached the same wrong conclusion. While Clinton won the popular vote by nearly 3 million votes, Trump secured the presidency with 304 electoral votes to her 227. The results eroded the public’s trust in polls and in the pundits who rely on them for talking points. So, it’s not surprising that people are questioning the validity of current polls that show Democratic presidential candidate Joe Biden ahead of Trump, even on the electoral map.
“There’s a lot of market pressure to try to get [polls] to be as accurate as possible. Everybody has been working as hard as they can,” Wharton statistics professor Abraham (Adi) Wyner said about the upcoming election. Polls aren’t perfect, he noted, but they are increasingly valuable in a world obsessed with data science and predictive analytics.
Wyner, who also serves as a faculty lead of the Wharton Sports Analytics and Business Initiative, joined Wharton Business Radio on SiriusXM to explain how poll data is collected and analyzed. Wyner discussed three reasons why polls can be so problematic:
- It’s not an exact science.
Despite all the number crunching, polls don’t always call the outcome correctly because there are so many variables involved in an election. The two kinds of forecasts with the widest discrepancy, according to Wyner, are the estimates that come directly from polling data and those from betting markets, which are money exchanges where people buy and sell futures based on upcoming events.
Predictive models rely on polling data that is collected from surveys and other methodologies, while betting markets are less precise and can factor in different influences, such as the Electoral College. In the last election, bettors favored Trump over Clinton, perhaps because of the gaps in polling, Wyner said. Right now, the bettors are keen on Biden.
Biden is expected with near certainty to capture several large states that contribute significantly to the popular vote, including California, New York and New Jersey, Wyner said. That means battleground states are even more critical in this election.
“You really care about the places where the election is close, and you also care about the correlation. That’s the part that makes it hard,” he said. “The basic idea is if one battleground state goes one way, you can expect most of them to go the same way. You don’t expect them to behave independently. Whatever you’re missing, you’re probably missing equally in every place, and that’s where the uncertainty really comes from.”
That’s also where most modelers get it wrong, he said. “They treat the result in Ohio and Florida and Pennsylvania as if independently random. That’s not the case.”
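Wyner’s point about correlation can be illustrated with a small simulation. The sketch below is not his model or any forecaster’s; the three-point leads, the four-point total polling error and the split between shared and state-level error are all invented assumptions. It compares how often a poll leader loses all three states when the polling errors are independent versus when most of the error is shared across states.

```python
import math
import random

# Hypothetical polling leads, in percentage points, for one candidate.
# These numbers are invented for illustration, not real poll results.
LEADS = {"Ohio": 3.0, "Florida": 3.0, "Pennsylvania": 3.0}
TOTAL_SD = 4.0  # assumed size of the total polling error in each state


def sweep_probability(shared_sd, trials=20000, seed=1):
    """Estimate how often the poll leader loses every state in LEADS.

    shared_sd is the part of the polling error common to all states;
    the remainder of TOTAL_SD is treated as independent state-level noise.
    """
    state_sd = math.sqrt(TOTAL_SD**2 - shared_sd**2)
    rng = random.Random(seed)
    sweeps = 0
    for _ in range(trials):
        # One draw per simulated election, applied to every state alike.
        shared_error = rng.gauss(0.0, shared_sd)
        lost_all = all(
            lead + shared_error + rng.gauss(0.0, state_sd) < 0
            for lead in LEADS.values()
        )
        sweeps += lost_all
    return sweeps / trials


independent = sweep_probability(shared_sd=0.0)  # states treated as unrelated
correlated = sweep_probability(shared_sd=3.0)   # most of the error is shared
print(f"independent errors: {independent:.3f}")
print(f"correlated errors:  {correlated:.3f}")
```

Each state has the same total error in both runs, so the only thing that changes is how much of that error the states share. Under these assumptions, the chance of the leader losing all three states comes out several times larger in the correlated run, which is the kind of risk a model misses when it treats the states as independent.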
- It depends on who is polled.
Demographics play a key role in election surveys, which means something as seemingly innocuous as the method of contact can change the results. For example, if pollsters are contacting voters only through cellphones, they aren’t reaching individuals who cannot afford internet service or who may only have a landline. Even among cellphone users, many are averse to answering calls from unknown numbers because of spam.
“It’s very hard to get a representative sample that avoids what we call selection bias. In that particular case, you’re talking about non-response bias,” Wyner said. “The people who don’t respond, arguably, are very different from people who respond.”
He said statisticians try to control for that by asking demographic questions of those who do respond, then making adjustments.
“But it’s hard, it’s uncertain, and it produces results that are unreliable or questionably reliable,” Wyner said. “So, this is a brave new world. One might wonder if the whole approach of using telephones is outmoded and we need to use other techniques. The problem with other techniques is they are expensive.”
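The adjustment Wyner describes, in which pollsters ask demographic questions and then correct for who actually responded, is often done by reweighting. The sketch below shows one simple version of that idea, sometimes called post-stratification. The age groups, response counts and population shares are made up for illustration, and real pollsters typically weight on many more variables.

```python
def raw_support(responses):
    """Share of all respondents who back the candidate, with no adjustment."""
    return sum(1 for _, supports in responses if supports) / len(responses)


def weighted_support(responses, population_shares):
    """Reweight each group's support by its assumed share of the electorate.

    responses: list of (group, supports_candidate) pairs
    population_shares: dict mapping group -> share of the electorate
    Assumes every group in population_shares has at least one respondent.
    """
    counts = {}
    for group, supports in responses:
        n, yes = counts.get(group, (0, 0))
        counts[group] = (n + 1, yes + (1 if supports else 0))
    estimate = 0.0
    for group, share in population_shares.items():
        n, yes = counts[group]
        estimate += share * (yes / n)
    return estimate


# Hypothetical sample: younger voters respond far less often than older ones.
responses = (
    [("under_45", True)] * 12 + [("under_45", False)] * 8
    + [("45_plus", True)] * 32 + [("45_plus", False)] * 48
)
# Assumed makeup of the electorate (invented for this example).
population_shares = {"under_45": 0.5, "45_plus": 0.5}

raw = raw_support(responses)
adjusted = weighted_support(responses, population_shares)
print(f"raw: {raw:.2f}, adjusted: {adjusted:.2f}")
```

In this made-up sample, the raw figure understates the candidate’s support because the group that favors the candidate is underrepresented among respondents. The adjustment only helps if the people who answer resemble the people in their group who do not, which is the uncertainty Wyner is pointing to.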
- Voters can — and do — change their minds.
The current political climate is so polarized that it seems reasonable to assume there are no undecided voters in the upcoming election. Citizens are casting ballots for either Trump or Biden, without equivocation.
But Wyner warns against such assumptions because a lot can happen between now and November 3.
“People generally believe that the challenge right now is figuring out who would win, as if the election were tomorrow. But I think that’s a mistaken belief,” he said. There are several weeks to go, “and even though people are mostly decided, people can, and do, change their minds.”
Wyner believes that’s what happened in 1948, when pollsters and pundits were spectacularly wrong in predicting that Republican Thomas Dewey would beat Democrat Harry Truman in the race for president.
“That is where Truman convinced people in the last two weeks to vote for him. I don’t think that was a sampling problem. It’s the famous blown election call, where the forecasts all said that Dewey was going to win but, in fact, Truman won,” he said. “I think that people can and will change their minds, and they’re going to react to things that happen in the next few [weeks].”
A debate between Trump and Biden could change the minds of some voters, as could the progress of the coronavirus pandemic or the stability of the economy.
“You can think about things on all sides that could really change people’s minds,” Wyner said.
Wyner is one of the hosts of Wharton Moneyball, a SiriusXM show that focuses on sports data. It airs Wednesdays at 8 a.m. EST on channel 132.