Thursday, March 17, 2011

Poll field dates and election results

If there is going to be an election this spring, it will start next week. That has me preparing for the campaign, and one of the things I decided to look at was how to weight polls by their field dates.

Currently, polls are weighted on a weekly basis according to their final day of polling. For example, every poll whose final day of polling falls between March 13 and March 19 (i.e., this week) is weighted most heavily. As each week passes, the date weighting of each poll is reduced by 25%.
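To illustrate, here is a rough Python sketch of that weekly scheme. The function names are my own, the 25% reduction is treated as multiplicative, and age is measured in rolling seven-day blocks rather than the fixed calendar weeks I actually use, just to keep the example short:

from datetime import date

def weeks_old(last_field_day, today):
    # Whole weeks between a poll's final day in the field and today.
    return max(0, (today - last_field_day).days // 7)

def weekly_date_weight(last_field_day, today, decay=0.25):
    # Full weight for polls finishing in the current week; each additional
    # week of age cuts the weight by 25% (applied multiplicatively here).
    return (1.0 - decay) ** weeks_old(last_field_day, today)

# A poll that left the field on March 1 is two weeks old on March 17:
print(weekly_date_weight(date(2011, 3, 1), date(2011, 3, 17)))  # 0.5625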

Obviously, with an election campaign lasting five weeks this would not do. So I took a look at the 2008 campaign and tried to pinpoint how the time between when a poll was conducted and when the election took place affects the poll's reliability.

Of course, this could be a bit of a fool's errand. The polls could be on the money throughout the campaign, simply tracking changing opinion rather than being wrong at the beginning and less wrong at the end. But I was hoping there would be some kind of pattern or trend hinting at a way I could weight polls by their dates.

The following charts track the average difference between national polling results for the five main parties and the final tally on October 14, the day of the 2008 election. I used the poll results from Wikipedia, but note that some of them were dated by their release date rather than by their last day of polling.
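To make that measure concrete, here is a minimal Python sketch of the error calculation, using the 2008 results and the hypothetical poll average from the worked example in the comments below:

FINAL_2008 = {"CPC": 37.7, "LPC": 26.3, "NDP": 18.2, "BQ": 10.0, "GPC": 6.8}

def average_error(poll, result=FINAL_2008):
    # Mean absolute difference across the five parties; over- and under-shoots
    # are both counted as error, so they do not cancel each other out.
    return sum(abs(poll[p] - result[p]) for p in result) / len(result)

print(average_error({"CPC": 37.0, "LPC": 27.0, "NDP": 19.0, "BQ": 10.0, "GPC": 7.0}))  # 0.48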

I started by tracking polls on a weekly basis. This is the system I currently use, so I wanted to see how useful it would be during a campaign.

Clearly, there is no pattern here. The polls did close in on their target in the final week of the campaign, but were still off by an average of 1.86 points. In fact, the weekly average was closest to the election result in Week 4 (September 24-30), when the polls were off by an average of 1.61 points.

So, weighting polls by week would not be the best option during a campaign. It would also cause the projection to react very slowly to changing trends.

Next, I looked at how the average polling results over blocks of three days compared to the final election results.

This was slightly better. On the last three days of the campaign, the polls were off by an average of 1.52 points, bettered only by the block of time between September 27 and 29, when the polls were off by 1.45 points.

You can clearly see how in the final six days of the campaign voting intentions solidified and the pollsters got closer to the election night's results. You can even see that between September 6 and September 29, the polls were getting increasingly close to the mark.

Late September was also the point of the campaign when the gap between the Liberals and the Conservatives was at its widest. At that point, the Conservatives were polling in the high-30s and even at 40%, while the Liberals were polling in the low-20s.

Nevertheless, I'm still not pleased with these three-day results, and, again, the projection would react slowly to shifts in support. With the last campaign averaging a little more than three polls per day (on several occasions there were FIVE polls in a single day), daily tracking is unlikely to be thrown off by margins of error and methodological differences. So it would be best for ThreeHundredEight to weight polls on a daily basis throughout the campaign.

And 2008's results demonstrate that this would be the best approach.

Over the last week of the campaign, polling results likely echoed the voting intentions of Canadians as the last few undecideds and fence-sitters made their decisions. From being an average of 2.63 points off the election's results on October 6, the pollsters got closer and closer as the week went on. On October 13, the polls were only 1.16 points off (the best result of the campaign), and over the last four days of the campaign they averaged less than two points of difference.

But how to age the polls as the campaign goes on? I want the projection to reflect the latest mood of the public, but to keep it safe from the wild fluctuations that sometimes take place but never last.

Contrary to what you might think, the best course of action appears to be to age the polls slowly, at a rate of about 5% per day. In 2008, all else being equal (sample size, pollster reliability), that would have given me an average error of 1.1 points for each party, with the NDP, Bloc, and Liberals all very close to the actual result. The Conservatives would have been underestimated by about 2.4 points, but that is more the fault of the pollsters, who did not seem to catch the quick Conservative rebound over the final weekend.
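For the sake of illustration, this is how such a daily aging could be implemented, treating the 5% as a multiplicative decay (a straight-line reduction would be just as easy to swap in):

from datetime import date

def daily_date_weight(last_field_day, today, decay=0.05):
    # A poll keeps 95% of its date weight with each passing day.
    days_old = max(0, (today - last_field_day).days)
    return (1.0 - decay) ** days_old

# A week-old poll keeps roughly 70% of its weight; a month-old poll about 21%.
print(round(daily_date_weight(date(2011, 3, 10), date(2011, 3, 17)), 2))  # 0.7
print(round(daily_date_weight(date(2011, 2, 15), date(2011, 3, 17)), 2))  # 0.21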

I will look at the 2006 campaign as well to see if this weighting will work. It may depend on the style of the campaign. The last election was a relatively stable one: voting intentions shifted, but never dramatically. If the next campaign follows the same pattern, a conservative aging of polls will likely be the best bet. But if the next campaign resembles 2006, when the underdog ended up winning, I may need to change the weighting system mid-campaign in order to reflect the huge shift in voting intentions.

12 comments:

  1. In '08 we had Dion's 'I don't understand the question' comment that blew it for him, and in '06 it was the 'sticky fingered Liberals' who shot themselves in the foot. This time we have a Liberal leader who appears elitist, arrogant, and doesn't even want to be here. Result: Tory win. But I'm saying there won't be an election, as the NDP has too much to lose at the moment.

  2. Pollster.com had some very good methods for weighting polls by recency in the last US presidential race. I'm sure you're familiar with their work. (They're now owned by HuffPo; not sure that's a good thing.)

    They had a different set of challenges than you of course. I like best/worst case scenarios, which you each produce in your own ways.

    Also, tracking at the provincial level really helps get a sense of the race. I remember Pollster -- and pollsters -- hardly looked at non-swing states. In your case, fluctuations in Alberta for example mean almost nothing. This race is about Ontario and BC, in my humble opinion, so more time and emphasis on those numbers is what I'd like to see.

    Thanks again for this amazing blog. Gets better all the time.

  3. The other issue to keep in mind is the individual methodology. I know some pollsters were doing daily polling with equal completes daily, but reporting a rolling window rather than daily results.

  4. @11:26 Anon

    In 2008, we also had Harper shoot himself in the foot in Quebec with the funding for the arts comments, handing significant momentum to the BQ late in the race.

  5. I'm not sure the uptick for the Cons in the last weekend of '08 was a result of any change in voter sentiment. They have a *very* effective Get-Out-The-Vote operation, and the strong finish might just be a reflection of that. On the other side of the coin, in most ridings there is virtually no Green GOTV operation, so all those folks who tell pollsters (and party canvassers) that they would vote green on e-day don't get dragged to the polls by volunteers, depressing their numbers by a significant margin.

    Never underestimate the power of a strong GOTV operation. The Libs had it all through the Chretien years, and it fell apart when the party divided itself during the Martin years. Volunteers were less than enthusiastic about Dion, and on the other side the military precision of the Conservative voter-tracking system and their fascist command and control structure works to their significant advantage on e-day.

  6. Anonymous 11:44,

    Yes, I will be keeping an eye on that to avoid having the same daily results in my model more than once.

  7. "I used the poll results from Wikipedia, but note that not all of them were dated according to the last polling date but instead on their release date."

    Sorry about that. I wasn't as vigilant at maintaining that chart as I am now. The 2011 election opinion polls chart has no such errors (well, it has one - there was a poll reported in the media but not officially released, so the polling date there is a guess, but it's not the release date).

  8. Anonymous....


    This time we have a corrupt, dishonest Harper, who is responsible for all the sleaze that is now obvious to all.

    Gradually, this will wear down his lead and he will be finished.

  9. I think you need to be consistent. Polls are much better at indicating trends than at matching actual results. If you change your model, you will destroy the ability to see what polls are best at. There is always a margin of error.

  10. It's interesting to look at the daily tracking that was done by Nanos during the 05/06 campaign. By Dec. 22 - three and a half weeks into the campaign - despite all the unceasing bad publicity about Liberal corruption etc., the Liberals were still 10 points ahead and the polls had been remarkably stable for over three weeks. The NDP was stuck in the low teens and the Tories were in single digits in Quebec... then look at what the results were in January '06.

    http://www.sesresearch.com/election/SES%20CPAC%20December%2022%202005E.pdf

  11. Are those differences in absolute terms? Because if they aren't, it's possible the average errors only became narrower because the errors cancel each other out (like the Tories being projected 2 points too high but the Liberals 2 points too low).

  12. No, the errors don't cancel each other out. Whether the poll average was above or below the result was not recorded, only the amount by which it was off.

    For example, the results were (CPC, LPC, NDP, BQ, GPC) 37.7%, 26.3%, 18.2%, 10.0%, 6.8%.

    If the poll average was 37%, 27%, 19%, 10%, and 7%, the total error was 2.4, an average of 0.48 for the five parties.

