Making sense of the bewildering mess of US polling

ABC The Drum

There are a lot of disparate predictions coming out of the US election campaign, and keeping up can be daunting.

With two and a half weeks until the US presidential election, who is winning the race? People looking for an answer to this question face a bewildering mess of contradictory information.

In the past week, the highly respected Gallup tracking poll has given Romney a seemingly massive six point lead, while an equally respectable ABC/Washington Post poll puts Obama up by three points nationally.

One average of national polls gives Romney a very narrow 0.4 per cent edge, but the Intrade futures market, where people bet money on the outcome, gives Obama a 65 per cent chance of winning. Australian bookies are even more optimistic about Obama's chances, paying $3.10 for a Romney win. What is going on here?

Most of our information about how people intend to vote comes from opinion polls conducted by an array of competing organisations, usually with samples ranging from a few hundred to over a thousand. These pollsters are primarily interested in getting the results right, which builds their reputations as quality firms and ensures people take them seriously in the future.

While some American polling organisations are openly partisan (such as the Democratic Public Policy Polling outfit), most of them try to minimise bias. Yet they often get very different results. So who should we trust?

The answer is that we can't entirely trust any poll on its own. These polls are trying to give us a picture of what more than a hundred million voters think based on interviews with a few hundred of them. At their best, they can give us a good estimation of the likely range of support for a candidate, but not a single accurate number.

Experienced readers of polls will know about the "margin of error" that surrounds poll numbers. In a sample of a thousand, pollsters will usually report that their numbers could vary by about three points in either direction. This means that if Romney leads Obama by 48 per cent to 46 per cent, the result is "within the margin of error," and the true figure could be anything from Obama leading Romney by 49 per cent to 45 per cent, to Romney leading Obama by 51 per cent to 43 per cent.
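The "about three points" figure can be checked with the textbook formula for sampling error. The sketch below assumes a simple random sample, support near 50 per cent, and the conventional multiplier of 1.96; real pollsters weight their samples, so their published margins can differ. The function name is ours, for illustration only.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Half-width of the interval around a polled share, in percentage points.

    Assumes a simple random sample; z=1.96 is the usual multiplier.
    """
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    return 100 * z * standard_error

print(f"1,000 respondents: +/- {margin_of_error(1000):.1f} points")
print(f"  400 respondents: +/- {margin_of_error(400):.1f} points")

# The example above: Romney 48, Obama 46, with a rounded margin of 3 points.
romney, obama, margin = 48, 46, 3
print(f"Romney could be anywhere from {romney - margin} to {romney + margin}")
print(f"Obama could be anywhere from {obama - margin} to {obama + margin}")
```

Note that the margin shrinks only with the square root of the sample size, which is why pollsters rarely pay for samples much larger than a thousand.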

But even these more careful estimates have a caveat that pollsters don't usually mention. The margin of error describes what statisticians call a "confidence interval", and the industry standard for a poll is 95 per cent confidence.

So the way you should interpret any given poll is this: 19 out of 20 times, the true number will be within the range obtained by the pollster. Those sound like pretty good odds, but remember that in the current electoral cycle we are often seeing more than 20 polls in a single day. Some of them are going to be off, no matter how well they are conducted.
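To put a rough number on that: if each poll has a 95 per cent chance of landing inside its margin of error, the chance that at least one poll in a batch misses grows quickly with the size of the batch. The sketch below assumes the polls are independent of one another, which is a simplification, and the function name is ours.

```python
def chance_of_a_miss(poll_count, confidence=0.95):
    """Probability that at least one of poll_count polls falls outside its
    margin of error, assuming the polls are independent (a simplification)."""
    return 1 - confidence ** poll_count

for poll_count in (1, 5, 20):
    print(f"{poll_count:>2} polls: {chance_of_a_miss(poll_count):.0%} chance of at least one miss")
```

On a day with 20 polls, the odds that every one of them is inside its margin are only a little better than one in three.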

The sheer volume of polling in the United States is a godsend for statisticians. A basic principle underlying all statistics is the so-called "law of large numbers": when you have a lot of different results from the same kind of test, the average of those results will be closest to the truth, and the average gets closer to the truth the more results you have.

So even though any single poll may miss by a few points, if we take the average of dozens of polls, we will end up with numbers that are fairly close to reality. Websites such as Realclearpolitics.com specialise in this kind of poll aggregation. However, it is the most bizarre "outlier" polls that often get the most attention in the media.
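A small simulation illustrates why averaging helps. It is only a sketch: the "true" level of support is an invented figure, every simulated poll is a perfect random sample of 500 people, and the function names are ours. Averaging cancels out random sampling error, but it would not remove a bias shared by all the polls.

```python
import random
import statistics

TRUE_SUPPORT = 0.48  # invented "true" share backing a candidate

def simulate_poll(true_support, sample_size, rng):
    """Share (0-100) of a random sample that backs the candidate."""
    backers = sum(rng.random() < true_support for _ in range(sample_size))
    return 100 * backers / sample_size

def typical_error(poll_count, trials=100, sample_size=500, seed=2012):
    """Average distance, in points, between the truth and the mean of
    poll_count simulated polls, measured over many repeated trials."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        polls = [simulate_poll(TRUE_SUPPORT, sample_size, rng)
                 for _ in range(poll_count)]
        errors.append(abs(statistics.mean(polls) - 100 * TRUE_SUPPORT))
    return statistics.mean(errors)

for poll_count in (1, 5, 25):
    print(f"Average of {poll_count:>2} polls is typically off by "
          f"{typical_error(poll_count):.2f} points")
```

The typical miss should fall as more polls go into the average, roughly in proportion to the square root of the number of polls.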

There are a lot of additional complicating factors. Different polling firms have different sampling techniques, and we don't always know which gets the most accurate representative sample. And we don't always want the most accurate representative sample. In the United States, the key number is "likely voters", which pollsters estimate in different ways.

The single biggest complicating factor in presidential polling is the electoral college. Every state is allocated a certain number of votes in the electoral college, and in almost every state the candidate who wins the most votes gets all of that state's electoral votes.

States like New York and California consistently have Democratic majorities while Texas and Georgia are perennially Republican. The states that really matter are the ones that can go either way, and there are only a few of these: look out for Ohio, Virginia, Florida, Nevada, Colorado, North Carolina, Wisconsin, Iowa and New Hampshire.

Because these states are so important, any estimate of who will actually win the election must take into account what is happening in their state-level polls, which could be different from the national picture. Obama, for example, seems to have a robust lead in industrial Midwestern states like Ohio which have had a relatively quick economic recovery partly attributable to his policies. He does more poorly in places like Florida, which suffered a catastrophic housing collapse from which it has never recovered.
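The winner-take-all arithmetic can be sketched as follows. The electoral vote counts are the 2012 allocations for the nine swing states named above; the polling leads are invented purely for illustration and are not real poll results. The function name is ours.

```python
# Electoral votes allocated to each swing state for the 2012 election.
SWING_STATE_VOTES = {
    "Ohio": 18, "Virginia": 13, "Florida": 29, "Nevada": 6, "Colorado": 9,
    "North Carolina": 15, "Wisconsin": 10, "Iowa": 6, "New Hampshire": 4,
}

# INVENTED leads (Obama minus Romney, in points), for illustration only.
HYPOTHETICAL_LEADS = {
    "Ohio": 2.0, "Virginia": -0.5, "Florida": -1.5, "Nevada": 3.0,
    "Colorado": 0.5, "North Carolina": -3.0, "Wisconsin": 2.5, "Iowa": 2.0,
    "New Hampshire": 1.0,
}

def tally(leads):
    """Award each state's full slate of electoral votes to whoever leads it."""
    totals = {"Obama": 0, "Romney": 0, "Toss-up": 0}
    for state, lead in leads.items():
        if lead > 0:
            winner = "Obama"
        elif lead < 0:
            winner = "Romney"
        else:
            winner = "Toss-up"
        totals[winner] += SWING_STATE_VOTES[state]
    return totals

print(tally(HYPOTHETICAL_LEADS))
print("Swing-state votes in play:", sum(SWING_STATE_VOTES.values()))
```

A lead of half a point delivers exactly as many electoral votes as a lead of ten points, which is why a small shift in one large state can matter more than a large shift in the national numbers.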

Sites like fivethirtyeight.com and election.princeton.edu estimate probabilities of each candidate winning based on these state as well as national polls, and they rely on their own judgments of which polls are the most important and carry the most weight. Fivethirtyeight, published and hosted by the New York Times, also uses economic indicators to help predict the outcome, though these become less important as election day draws nearer.

One shortcut is to look at the betting markets and futures markets. At websites like Intrade, the punters are doing all the work for you: they look at what the polls are doing and place their bets accordingly; presumably they care about getting an accurate result because they have money riding on it.

These markets are often more stable than the polls themselves, and in the past they have sometimes been more accurate, but they too have their problems. There is little insider information to provide leverage to smart gamblers, and certain polls seem to have inordinately large effects on how the markets act.
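Bookmakers' prices can be turned into rough probabilities in the same spirit as the Intrade percentages. The sketch below treats the $3.10 quoted earlier for a Romney win as a decimal payout on a $1 stake and ignores the bookmaker's built-in margin, so the result is an approximation. Obama's price is not quoted here, so his figure is simply what is left over in a two-horse race. The function name is ours.

```python
def implied_probability(decimal_odds):
    """Chance implied by a decimal payout, ignoring the bookmaker's margin."""
    return 1 / decimal_odds

romney_odds = 3.10  # dollars returned per $1 staked on a Romney win
romney_chance = implied_probability(romney_odds)

print(f"Implied chance of a Romney win: roughly {romney_chance:.0%}")
print(f"Remainder for Obama in a two-horse race: roughly {1 - romney_chance:.0%}")
```

That remainder sits a little above Intrade's 65 per cent, which matches the observation that the Australian bookies were the more optimistic about Obama.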

So, expect a lot of uncertainty over the next couple of weeks. Different polls will throw up wildly different results, often not because of political bias but because of sheer randomness. It is difficult for anyone to see the whole picture, and good poll analysis begins with being honest about that.

This article was originally published at ABC The Drum
