Thursday, March 17, 2011

Poll field dates and election results

If there is going to be an election this spring, it will start next week. This has led me to prepare for the campaign, and one of the things I decided to look at was how to weight polls by their field dates.

Currently, polls are weighted on a weekly basis, based on their final day of polling. For example, every poll whose final day of polling falls between March 13 and March 19 (i.e., this week) would be weighted most heavily. As each week passes, the date weighting of each poll is reduced by 25%.
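As a rough illustration, here is how that weekly scheme could be expressed in code. This is only a minimal sketch, not the actual projection model: it assumes the 25% reduction is multiplicative (a factor of 0.75 per week back), and the function name and inputs are my own.

```python
from datetime import date

def weekly_date_weight(poll_end: date, week_start: date) -> float:
    """Weekly date weight: a poll whose last field day falls in the
    current week (beginning at week_start) gets full weight, and the
    weight drops by 25% for each week further back.  Sketch only --
    the 25% reduction is read here as multiplicative (0.75 per week)."""
    days_back = (week_start - poll_end).days
    if days_back <= 0:                 # poll ends in the current week
        return 1.0
    weeks_back = (days_back + 6) // 7  # count partial weeks as whole weeks
    return 0.75 ** weeks_back

# A poll ending March 15 gets full weight this week; one ending March 8,
# in the previous week, would get 0.75, and so on.
print(weekly_date_weight(date(2011, 3, 15), date(2011, 3, 13)))  # 1.0
print(weekly_date_weight(date(2011, 3, 8),  date(2011, 3, 13)))  # 0.75
```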

Obviously, with an election campaign lasting five weeks, this would not do. So I took a look at the 2008 campaign and tried to pinpoint how the time between a poll's field dates and election day affects the poll's reliability.

Of course, this could be a bit of a fool's errand. The polls could be on the money throughout the campaign, and simply be tracking changing opinion rather than being wrong at the beginning and less wrong at the end. But I was hoping there would be some kind of pattern or trend, hinting at a way I could weight polls by their dates.

The following charts track the average difference between national polling results for the five main parties and the final tally on October 14, the day of the 2008 election. I used the poll results from Wikipedia, but note that not all of them are dated by their last day of polling; some are dated by their release date instead.
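The comparison itself is straightforward arithmetic. The sketch below shows one way it could be computed, assuming hypothetical poll records of the form (last field day, party shares) and the final result supplied as a dictionary; the function and field names are illustrative, not the code actually used here.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, Tuple

Poll = Tuple[date, Dict[str, float]]  # (last field day, party -> share in points)

def avg_error_by_block(polls: List[Poll],
                       final_result: Dict[str, float],
                       election_day: date,
                       block_days: int) -> Dict[int, float]:
    """Average absolute difference, in points and across the parties in
    final_result, between the polls in each block of block_days days and
    the election result.  Blocks count back from election day: block 0
    covers the last block_days days of the campaign."""
    errors = defaultdict(list)
    for poll_end, shares in polls:
        block = (election_day - poll_end).days // block_days
        err = sum(abs(shares[p] - final_result[p])
                  for p in final_result) / len(final_result)
        errors[block].append(err)
    return {block: sum(errs) / len(errs) for block, errs in sorted(errors.items())}
```

Setting block_days to 7, 3, and 1 gives the weekly, three-day, and daily comparisons discussed below.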

I first started by tracking polls on a weekly basis. This is the system I currently use, so I wanted to see how useful it would be during a campaign.

Clearly, there is no pattern here. The polls did close in on their target in the final week of the campaign, but were still off by an average of 1.86 points. In fact, the weekly average was closest to the election result in Week 4 (September 24-30), when they were off by an average of 1.61 points.

So, weighting polls by week would not be the best option during a campaign. And it would cause the projection to react very slowly to changing trends.

Next, I looked at how the average polling results over blocks of three days compared to the final election results.

This was slightly better. On the last three days of the campaign, the polls were off by an average of 1.52 points, bettered only by the block of time between September 27 and 29, when the polls were off by 1.45 points.

You can clearly see how in the final six days of the campaign voting intentions solidified and the pollsters got closer to the election night's results. You can even see that between September 6 and September 29, the polls were getting increasingly close to the mark.

This late-September stretch was the point of the campaign when the gap between the Liberals and the Conservatives was at its widest. At this point, the Conservatives were polling in the high-30s and even at 40%, while the Liberals were polling in the low-20s.

Nevertheless, I'm still not pleased with these three-day results. And, again, the projection would react slowly to shifts in support. With the last campaign averaging a little more than three polls per day (on several occasions there were FIVE polls in a single day), it is unlikely that daily tracking would be at the mercy of margins of error and methodological differences. So, it would be best for ThreeHundredEight to weight polls on a daily basis throughout the campaign.
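As a back-of-the-envelope check on that claim, pooling three typical polls of roughly 1,000 respondents each (an assumed sample size, since the real ones varied) gives a daily average whose pure sampling error is under two points. House effects and methodological differences don't shrink this way, but random sampling noise does.

```python
import math

def pooled_margin_of_error(polls_per_day: int, sample_per_poll: int = 1000) -> float:
    """Rough 95% margin of error, in points, for a simple pooled daily
    average of several polls.  Assumes each poll is a random sample of
    about sample_per_poll respondents (an assumption) and uses the
    worst case p = 0.5."""
    n = polls_per_day * sample_per_poll
    return 100 * 1.96 * math.sqrt(0.25 / n)

print(round(pooled_margin_of_error(3), 1))  # about 1.8 points
print(round(pooled_margin_of_error(5), 1))  # about 1.4 points
```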

And the 2008 results demonstrate that it would be.

Over the last week, polling results likely echoed the voting intentions of Canadians as the last few undecideds and fence-sitters made their decisions. From being an average of 2.63 points off of the election's results on October 6, the pollsters got closer and closer as the week went on. On October 13, the polls were only 1.16 points off (the best result of the campaign), and over the last four days of the campaign they were averaging below two points of difference.

But how to age the polls as the campaign goes on? I want the projection to reflect the latest mood of the public, but keep it safe from the wild fluctuations that can sometimes take place but never last.

Contrary to what you might think, the best course of action appears to be to age the polls slowly, at a rate of about 5% per day. In 2008, all else being equal (sample size, pollster reliability), that would have given me an average error of 1.1 points per party, with the NDP, Bloc, and Liberals all coming in very close to their actual results. The Conservatives would have been underestimated by about 2.4 points, but that is more the fault of the pollsters, who did not seem to catch the quick Conservative rebound over the final weekend.
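Concretely, if the 5% daily aging is read as a multiplicative decay (an assumption on my part; a linear reduction would also fit the description), the date weight of a poll that is d days old would be roughly 0.95^d, as in this sketch:

```python
from datetime import date

def daily_date_weight(poll_end: date, today: date, daily_decay: float = 0.05) -> float:
    """Daily aging of polls: the date weight drops by about 5% per day,
    read here as a multiplicative decay of 0.95 per day (an assumed
    interpretation, not a confirmed formula)."""
    days_old = max(0, (today - poll_end).days)
    return (1.0 - daily_decay) ** days_old
```

Under that reading, a poll keeps roughly 70% of its weight after a week (0.95^7 ≈ 0.70) and about 17% after five weeks (0.95^35 ≈ 0.17), which is why the projection would favour the latest polls without swinging on a single day's numbers.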

I will look at the 2006 campaign as well to see if this weighting will work. It may depend on the style of the campaign. The last election was a relatively stable one: voting intentions shifted, but never dramatically. If the next campaign follows the same pattern, a conservative aging of polls will likely be the best bet. But if the next campaign resembles 2006, when the underdog ended up winning, I may need to change the weighting system mid-campaign in order to reflect the huge shift in voting intentions.