Thursday, May 5, 2011

Weighing the polls

On Monday night, ThreeHundredEight's seat projection model failed for two reasons. First, the 7% reduction in weight given to polls for each day that passed was not nimble enough to capture the huge shift in voting intentions in the last two weeks. Second, the polls under-estimated Conservative support by over two points nationally, three points in Ontario, and four points in British Columbia. A seat projection is only as good as the polls that are put into it, so this polling miss is part of the reason why the seat projection failed.
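To see why a 7% daily reduction reacts slowly, here is a minimal sketch in Python. It assumes the reduction compounds, so that each passing day multiplies a poll's weight by 0.93. The function name age_weight is hypothetical, the 25% rate is included only for contrast, and this is not the actual ThreeHundredEight model, which also weighs polls by other factors.

```python
def age_weight(age_days: int, daily_reduction: float) -> float:
    """Relative weight of a poll that is `age_days` old.

    Assumption: the reduction compounds daily, so a 7% reduction
    multiplies the weight by 0.93 for each day that passes.
    """
    return (1.0 - daily_reduction) ** age_days


# Compare how quickly old polls fade at a slow and a fast rate.
for age in (0, 7, 14):
    slow = age_weight(age, 0.07)
    fast = age_weight(age, 0.25)
    print(f"{age:>2} days old: 7%/day -> {slow:.3f}, 25%/day -> {fast:.3f}")
```

Under that assumption, a two-week-old poll still carries more than a third of the weight of a brand-new one at 7% per day, which is a lot of influence for numbers gathered before a large shift in opinion.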

So what steps can be taken to fix the problem? This fall, a flurry of provincial elections will occur and ThreeHundredEight will be endeavouring to do a much better job of accurately projecting their outcomes. Tuesday's post demonstrates that the projection model can produce good results, and tweaks will be made to make it even better. But clearly the vote projection model has to be overhauled.

The first step was to throw out the weightings given to the last three elections. The idea was to use these elections as an "anchor", ensuring that the projection would not be thrown off too much by the fluctuations in the polls. For example, using this anchor was an effective way to keep the Green vote below the unrealistic support levels at which various polls put them.

But in a wild campaign like that of 2011, the anchor was unnecessarily weighing down the projection. Indeed, instead of being off by a cumulative 12.8 points, the projection would have been off by 12.2 points without the anchor. That means it's gone.

Next, I played around with the vote projection model and changed only the rate of depreciation of each poll's weight by age. The following chart shows the cumulative error between the vote projection and the actual result for the five main parties at various reduction rates (i.e., 0.75 means a daily reduction of 25%).

As you can see, the best weighting was between a daily reduction of 20% and 30% (0.80 to 0.70). Delving more deeply, I found that a reduction of 23% per day had the best result, bringing the cumulative error down to 6.8 points for the five parties, an average of 1.4 points per party.
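The test described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual model: it weighs polls by age only, the function names weighted_projection and cumulative_error are hypothetical, and the polls and results are invented numbers for three placeholder parties.

```python
from typing import Dict, List, Tuple

# A poll is (age in days, {party: support in %}).
Poll = Tuple[int, Dict[str, float]]


def weighted_projection(polls: List[Poll], retention: float) -> Dict[str, float]:
    """Age-weighted average of polls; retention 0.77 means a 23% daily reduction."""
    totals: Dict[str, float] = {}
    weight_sum = 0.0
    for age_days, shares in polls:
        weight = retention ** age_days  # assumed compounding decay
        weight_sum += weight
        for party, share in shares.items():
            totals[party] = totals.get(party, 0.0) + weight * share
    return {party: total / weight_sum for party, total in totals.items()}


def cumulative_error(projection: Dict[str, float], actual: Dict[str, float]) -> float:
    """Sum of absolute misses, in points, over every party in `actual`."""
    return sum(abs(projection[party] - actual[party]) for party in actual)


# Hypothetical data: invented polls and an invented result.
polls = [
    (0, {"A": 38.0, "B": 31.0, "C": 19.0}),
    (5, {"A": 36.0, "B": 28.0, "C": 23.0}),
    (12, {"A": 39.0, "B": 20.0, "C": 27.0}),
]
actual = {"A": 40.0, "B": 30.0, "C": 19.0}

# Sweep several retention rates, as in the comparison described above.
for retention in (0.93, 0.80, 0.77, 0.70):
    error = cumulative_error(weighted_projection(polls, retention), actual)
    print(f"retention {retention:.2f}: cumulative error {error:.2f} points, "
          f"{error / len(actual):.2f} per party")
```

The per-party figure is simply the cumulative error divided by the number of parties: 6.8 points over five parties is 1.36, which rounds to 1.4 points each.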

Rates of 21% and 22% also resulted in a cumulative error of 6.8 points, but after looking at how each rate would have done at the regional level for Ontario, Quebec, and British Columbia, I settled on 23% as the closest.

Nevertheless, the projection would have still been off. Though it would have been very close for the New Democrats and Bloc (on the money for the Bloc and off by 0.2 points for the NDP), it would have been further off for the Conservatives (36% rather than the actual 39.6%) and the Liberals (20.5% instead of 18.9%). In fact, for the Tories it was impossible to get closer than 36.3% using a weighted average of all polls.

The pollsters are partly to blame for that. If I dropped the weighted projection completely and simply used the average of the final poll from each polling firm released within the last five days of the campaign, I would still have only gotten 37.4% for the Conservatives, off by 2.2 points. Most significantly, the Conservative average would have been low by 4.6 points in British Columbia and 3.5 points in Ontario - that is where the Tory majority was won.
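For comparison, the simple average described above can be sketched like this. It assumes that "within the last five days" means a poll's final field date falls zero to five days before election day (May 2, 2011), and that every poll reports the same parties. The Poll type, the function final_poll_average, the firm names, and all the numbers are hypothetical.

```python
from datetime import date
from typing import Dict, List, NamedTuple


class Poll(NamedTuple):
    firm: str
    end_date: date
    shares: Dict[str, float]  # party -> support in %


def final_poll_average(polls: List[Poll], election_day: date,
                       window_days: int = 5) -> Dict[str, float]:
    """Unweighted mean of each firm's most recent poll inside the window."""
    latest: Dict[str, Poll] = {}
    for poll in polls:
        age = (election_day - poll.end_date).days
        if not 0 <= age <= window_days:
            continue  # too old, or dated after election day
        current = latest.get(poll.firm)
        if current is None or poll.end_date > current.end_date:
            latest[poll.firm] = poll
    if not latest:
        raise ValueError("no polls fall inside the window")
    parties = next(iter(latest.values())).shares.keys()
    return {
        party: sum(p.shares[party] for p in latest.values()) / len(latest)
        for party in parties
    }


# Hypothetical polls: Firm A's older poll and Firm C's stale poll are ignored.
polls = [
    Poll("Firm A", date(2011, 4, 28), {"Party X": 36.0, "Party Y": 30.0}),
    Poll("Firm A", date(2011, 5, 1), {"Party X": 38.0, "Party Y": 32.0}),
    Poll("Firm B", date(2011, 4, 30), {"Party X": 37.0, "Party Y": 31.0}),
    Poll("Firm C", date(2011, 4, 20), {"Party X": 50.0, "Party Y": 20.0}),
]
average = final_poll_average(polls, election_day=date(2011, 5, 2))
print(average)
```

Each firm counts once no matter how often it polled, so a prolific pollster cannot dominate the average the way it can in an age-weighted average of all polls.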

Nevertheless, abandoning the weighted vote projection for a simple average would have given a better result, with a cumulative error of 5.8 points. The average cumulative error in British Columbia, Ontario, and Quebec would have been 8.5 points, rather than the 9.6 points when using a reduction rate of 23% in the weighted average.

At this point, one might suggest that I move to a polling average of only the most recent polls and be done with it.

However, the 2006 election tells a different story. There are some parallels between the 2006 and 2011 elections, as they both featured a shift in public opinion during the campaign. Using a weighted average of all polls with a reduction rate of 10% per day would bring about a very good result, with a cumulative error of 3.3 points at the national level. Any other reduction rate pushes the error to 3.9 or 4.0 points, which is still very acceptable over five parties.

But using only the final polls of the 2006 campaign would result in a cumulative error of 5.8 points. In other words, while a simple poll average would have been more accurate in 2011, the weighted poll averaging system that I use in my popular vote projection would have been more accurate in 2006. And of all these scenarios, using the average of the last five days of polling in the 2008 election would have been the worst of all, with a total error of 6.9 points.

If ThreeHundredEight is still around for the 2015 election, some sort of ballot box effect might have to be taken into account. Obviously, a ballot box effect at the federal level cannot be translated to the provincial level - the parties just aren't the same. But this daily weighting of 23% appears to give good results, and so will be used for the provincial projections this fall. Adjustments for polling error may also be investigated for each of the provincial elections and included in the model, while the track records of each of the pollsters will also be adjusted in the wake of this election.

On that note, I will be taking a detailed look at how the pollsters did (briefly hinted at here) next week.

UPDATE 06/05/11: Running the projection at 23% daily reduction, I would have projected 143 Conservatives, 102 New Democrats, 49 Liberals, and 14 Bloc seats. While it would have been a much better projection, it still would not have called the Conservative majority - just as the polls did not.