Tuesday, May 19, 2015

Methodological note on today's projection update

I've received questions and noticed some discussion concerning the weights applied to the newest polls added to the model. I wanted to take a moment to clarify things for readers.

Three recent polls have been added to the projection: EKOS/iPolitics (May 6-12), Forum/Toronto Star (May 12-13), and Nanos/CTV (April 17-May 14). EKOS has the largest sample at 2,177, while Forum surveyed 1,286 Canadians and Nanos a total of 1,000.

EKOS and Forum carry similar weights in the projection, while Nanos carries a heavier one. Some readers have wondered why that is, considering that Nanos's poll was taken over a period of four weeks, and so includes much older data than either the Forum or EKOS polls.

Three things are taken into account when weighting a poll: the sample size, the field dates, and the track record of the polling firm. Nanos has had a very successful track record (though it has participated in fewer elections than either Forum or EKOS, and who knows how Nanos would have done in Alberta in 2012 or British Columbia in 2013 had it participated), and so has a heavier weight.

The model weights each poll on a weekly basis, depending on where a poll lands in the model's weekly blocks of time. If a poll straddles two or more weekly blocks, it is considered to have been conducted in the block that contains the largest share of its polling days. When several blocks tie for the largest share, the most recent one is selected. For that reason, Nanos and EKOS are considered to have been conducted in the same weekly block ending on May 11, while Forum is in the most recent block ending on May 18.
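
For readers who want to see the block rule in concrete terms, here is a minimal sketch in Python. It is an illustration only, not the model's actual code: the function names are hypothetical, and the sketch assumes that weekly blocks run from Tuesday to Monday, which is consistent with the block end dates of May 11 and May 18 mentioned above.

```python
from collections import Counter
from datetime import date, timedelta


def block_end(day: date, end_weekday: int = 0) -> date:
    """Return the last day of the weekly block containing `day`.

    Assumption: blocks end on Mondays (weekday 0), as May 11 and
    May 18, 2015 both do.
    """
    offset = (end_weekday - day.weekday()) % 7
    return day + timedelta(days=offset)


def assign_block(start: date, end: date) -> date:
    """Assign a poll to the block holding most of its field days.

    Ties between blocks are broken in favour of the most recent block.
    """
    days_per_block = Counter()
    day = start
    while day <= end:
        days_per_block[block_end(day)] += 1
        day += timedelta(days=1)
    most_days = max(days_per_block.values())
    return max(b for b, n in days_per_block.items() if n == most_days)


polls = {
    "EKOS": (date(2015, 5, 6), date(2015, 5, 12)),
    "Forum": (date(2015, 5, 12), date(2015, 5, 13)),
    "Nanos": (date(2015, 4, 17), date(2015, 5, 14)),
}
for firm, (start, end) in polls.items():
    print(firm, "-> block ending", assign_block(start, end))
```

Under these assumptions, EKOS has six of its seven field days in the block ending May 11, Forum falls entirely in the block ending May 18, and Nanos has three full weeks tied at seven days each, so the most recent of those, the block ending May 11, is chosen.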

The model is designed this way so as not to put too much importance on differences of a day or two between polls, since the election is so far away and shifts in voting intentions are slow to occur. When the election campaign begins, the model switches over to a daily weighting scheme.

But is it fair to consider the Nanos poll as recent as the EKOS poll, considering its field dates run back to mid-April? On the face of it, it isn't. This is a quirk of the model that is being exposed by Nanos's abnormally long polls (no one else is conducting polls taken over more than a week). During the election campaign, this would not be an issue as no pollster would release such a poll.

So why not consider Nanos an older poll? The guiding principle behind the model is uniformity and objectivity. All polls are handled by the same set of criteria, and no adjustments are made by me. This is what makes it a model in my view - once you stick your thumbs into it based on your gut or intuition, you are just making an educated guess.

I used to weight polls by their median date, rather than by the last day of polling. This was meant to reflect how some polls had older data in them than others, even if they finished polling on the same day. But this had a perverse effect. If I weighted these three polls by their median date, Forum would have been weighted for May 13, EKOS would have been weighted for May 9, and Nanos for May 1. Forum would be considered four days 'newer' than EKOS, despite EKOS leaving the field only one day before Forum did.
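
The median dates above can be reproduced with a short Python sketch. The function name is hypothetical, and the sketch assumes that a poll with an even number of field days is rounded to the later of its two middle days; that choice is made here only because it reproduces the dates quoted above.

```python
from datetime import date, timedelta


def median_date(start: date, end: date) -> date:
    """Return the middle day of a field period.

    Assumption: for an even number of field days, the later of the
    two middle days is used.
    """
    n_days = (end - start).days + 1
    return start + timedelta(days=n_days // 2)


forum = median_date(date(2015, 5, 12), date(2015, 5, 13))
ekos = median_date(date(2015, 5, 6), date(2015, 5, 12))
nanos = median_date(date(2015, 4, 17), date(2015, 5, 14))

print("Forum:", forum, "EKOS:", ekos, "Nanos:", nanos)
# Gap between Forum and EKOS by median date, in days
print("Median-date gap:", (forum - ekos).days)
# Gap between Forum and EKOS by last day in the field, in days
print("End-date gap:", (date(2015, 5, 13) - date(2015, 5, 12)).days)
```

The comparison at the end is the point: by median date the two polls sit four days apart, while by last day in the field they are separated by a single day.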

Median-date weighting would reward Forum's flash polls taken over one or two days, and penalize polls conducted over longer (but reasonable) periods. The Nanos example is an extreme, but the EKOS example is very relevant. Because EKOS polls over a longer period, it is less vulnerable to daily blips driven by the news of the day, and it has the opportunity to call back people who did not pick up the first time around. Polls taken over a day or two have no such opportunity, and run a higher risk of being unrepresentative.

Because of this, I abandoned the median-date weighting. But because I apply things uniformly, the Nanos poll is treated as far newer by the model than common sense would dictate.

Five months (to the day) from the next election, this is not a big deal. During the election campaign, such an oddity would be very unlikely to occur. In any case, it hasn't had much effect on the projection, which moves slowly this far out from the vote. If the Nanos poll had been weighted for May 1, the median date, the projection would be little different: 31.8% for the Conservatives, 30.3% for the Liberals, and 25% for the NDP.