Saturday, March 19, 2011

Poll aggregation methodology

The following is a detailed description of how the polling aggregate is calculated. While this method is similar to how the vote projection is calculated in the run-up to and during an election campaign, there are some differences.

Currently, the methodology described here applies to the poll aggregations maintained for federal politics and for provincial politics in Ontario and Quebec.

Poll aggregation

The weighted polling average includes all publicly available opinion polls. Polls are weighted by their age and sample size, as well as by the track record of the polling firm.

The weight of a poll is reduced by 35% with each passing week. For polls taken over multiple days, the median day of the field period is used to date the poll. For example, a poll taken between March 12 and March 14 would be dated March 13.
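As a rough illustration, the age weighting could be sketched as follows. The function names are hypothetical, and two details are assumptions rather than part of the description above: the 35% reduction is compounded and applied continuously to fractional weeks, and a field period with an even number of days is dated to the earlier of its two middle days.

```python
from datetime import date


def median_field_date(start: date, end: date) -> date:
    """Return the middle day of a poll's field period.

    Assumption: when the period has an even number of days, the earlier
    of the two middle days is used.
    """
    return start + (end - start) // 2


def age_weight(poll_date: date, today: date, weekly_loss: float = 0.35) -> float:
    """Return the share of its original weight a poll keeps at `today`.

    Assumption: the 35% weekly reduction compounds and is applied to
    fractional weeks, so a poll that is 3.5 days old has lost less than 35%.
    """
    weeks_old = (today - poll_date).days / 7
    return (1 - weekly_loss) ** weeks_old


# The example from the text: a poll in the field from March 12 to March 14.
poll_date = median_field_date(date(2011, 3, 12), date(2011, 3, 14))
one_week_later = age_weight(poll_date, date(2011, 3, 20))   # about 0.65
two_weeks_later = age_weight(poll_date, date(2011, 3, 27))  # about 0.42
```

Under these assumptions a poll keeps about 65% of its weight after one week and about 42% after two.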

The sample size weighting is determined by the margin of error that would apply to the poll assuming a random sample. The margin of error for a poll with a random sample of 1,000 people, for example, is +/- 3.1%. A poll with a sample of 500 people has a margin of error of +/- 4.4%. Rather than giving the poll of 500 people half the weight of the poll of 1,000 people, the smaller poll would be weighted at 70% (3.1/4.4) of the larger poll.
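The figures above can be reproduced with the standard margin-of-error formula for a simple random sample at the 95% confidence level. The sketch below is only an illustration: the function names and the choice of a 1,000-person reference poll are assumptions, since the text only compares two polls with each other.

```python
import math


def margin_of_error(sample_size: int, z: float = 1.96, p: float = 0.5) -> float:
    """Margin of error, in percentage points, for a simple random sample.

    Uses the worst case p = 0.5 and z = 1.96 (95% confidence).
    """
    return 100 * z * math.sqrt(p * (1 - p) / sample_size)


def sample_size_weight(sample_size: int, reference_size: int = 1000) -> float:
    """Weight of a poll relative to a reference poll (hypothetical baseline).

    The weight is the ratio of the two margins of error, which simplifies
    to sqrt(sample_size / reference_size).
    """
    return margin_of_error(reference_size) / margin_of_error(sample_size)


moe_1000 = margin_of_error(1000)      # about 3.1
moe_500 = margin_of_error(500)        # about 4.4
relative = sample_size_weight(500)    # about 0.71
```

Because the margin of error shrinks with the square root of the sample size, halving the sample reduces the weight to roughly 70%, not 50%.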

Polling firms are also weighted by their track record of accuracy over the last 10 years. Their accuracy rating is determined by three factors: 1) the last poll the firm released in an election campaign, 2) their average error for all parties that earned 3% or more of the popular vote, and 3) the amount of time that has passed since the election. In order to take into account changes of methodology or improvements made over time, the performance of a polling firm in a recent election is weighted more heavily than their performance in an older election. The difficulty of each election is also taken into account: elections where the average error was lower are weighted more heavily than elections in which the error was higher. This is meant to take into consideration elections in which there were particular factors contributing to pollster error that were outside of the pollster's control. Conversely, in elections where the consensus was close to the mark a pollster has fewer excuses for higher error levels.

The accuracy rating is determined by comparing each firm's average error, weighted by how recent each election is, to that of the best-performing firm. For example, if the best-performing firm had an average error of 1.5 points per party, a firm with an average error of three points per party would be given half the weight.
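The comparison in this example amounts to dividing the best firm's average error by each firm's own average error. The sketch below assumes the recency and difficulty weighting has already been applied to the average errors it receives, since the exact formula for that step is not given here. The firm names and the function name are hypothetical.

```python
from typing import Dict


def accuracy_ratings(avg_errors: Dict[str, float]) -> Dict[str, float]:
    """Rate each firm relative to the firm with the lowest average error.

    `avg_errors` maps a firm to its average error in points per party,
    assumed to be already weighted by recency and election difficulty.
    The best firm gets a rating of 1.0.
    """
    best_error = min(avg_errors.values())
    return {firm: best_error / error for firm, error in avg_errors.items()}


# The example from the text, with placeholder firm names.
ratings = accuracy_ratings({"Firm A": 1.5, "Firm B": 3.0})
# Firm A is rated 1.0 and Firm B is rated 0.5.
```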

All of these ratings are combined to give each poll in the aggregation a weight. In sum, this means that newer polls with larger sample sizes from polling firms with a good accuracy record are weighted more heavily than older and smaller polls from firms with a bad track record.

A firm with no prior experience is automatically weighted at half the value of the best firm.

When more than one polling firm is active and another poll was conducted recently, no single poll is allowed to have a weight of more than 66% of the entire projection.
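One way to picture how the pieces fit together is the sketch below. It rests on several assumptions that the description does not confirm: the three ratings are combined by multiplication, the 66% cap is applied whenever the aggregate contains more than one poll, and the weight removed from a capped poll is redistributed to the other polls in proportion to their own weights. All names and numbers are hypothetical.

```python
from typing import List


def poll_shares(raw_weights: List[float], cap: float = 0.66) -> List[float]:
    """Convert raw poll weights into shares of the aggregate, with a cap.

    Assumes every raw weight is positive. Because the cap is above 50%,
    at most one poll can exceed it, so one pass is enough.
    """
    total = sum(raw_weights)
    shares = [w / total for w in raw_weights]
    top = max(range(len(shares)), key=lambda i: shares[i])
    rest = 1.0 - shares[top]
    if len(shares) > 1 and shares[top] > cap and rest > 0:
        # Cap the dominant poll and scale the others up to fill the gap.
        scale = (1.0 - cap) / rest
        shares = [cap if i == top else s * scale for i, s in enumerate(shares)]
    return shares


def weighted_average(values: List[float], shares: List[float]) -> float:
    """Weighted average of one party's support across the polls."""
    return sum(v * s for v, s in zip(values, shares))


# Assumed combination: age weight * sample size weight * accuracy rating.
raw = [1.0 * 1.0 * 0.9, 0.65 * 0.71 * 0.5]
shares = poll_shares(raw)                       # first poll is capped at 0.66
support = weighted_average([40.0, 30.0], shares)
```

In this made-up case the first poll would otherwise account for about 80% of the aggregate, so it is held to 66% and the second poll takes the remaining 34%.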

The performance of this method

This adjusted and weighted poll aggregation performs better than most individual polls and better than a simple, unweighted average of the last polls of a campaign. An analysis shows that the model used by ThreeHundredEight performed better, on average, than 19 of the 26 polling firms active in at least one of the campaigns in which the model made a projection. It was about 23% better than a simple average of polls and outperformed the average of the polls in 10 of 12 elections. Of the seven firms that have had a better average performance, ThreeHundredEight has bested all but two of them in individual election campaigns.