Thursday, January 12, 2012

Weighing the polls (again)

In the wake of the 2011 federal election, I took a look at what had been the Achilles' heel of my projection model at the time: the daily reduction in weight of a given poll. For the federal election, I reduced the weight of a poll by 7% per day. This was not aggressive enough to be able to adequately record the increase in New Democratic support in the final week of the campaign, throwing everything off.

Accordingly, after some analysis I decided to instead decrease the amount of weight given to a poll by 23% per day. This worked pretty well during the last set of provincial campaigns, enough so that it was not the cause of any major errors.
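
To make the effect of the daily reduction concrete, here is a minimal Python sketch of age-based weighting. It assumes that a poll's weight is multiplied by (1 - rate) for each day of age, and it shows only the age component of the weighting. The function names and the sample polls are hypothetical, and the model's other weighting factors are not reproduced here.

```python
def age_weight(age_days, daily_reduction):
    """Weight left after `age_days` days when the weight shrinks by
    `daily_reduction` (e.g. 0.23 for 23%) each day.
    Assumption: the reduction compounds daily."""
    return (1.0 - daily_reduction) ** age_days


def weighted_average(polls, daily_reduction):
    """Age-weighted average of one party's support.

    `polls` is a list of (age_days, support_percent) pairs.
    """
    weights = [age_weight(age, daily_reduction) for age, _ in polls]
    weighted_sum = sum(w * support for w, (_, support) in zip(weights, polls))
    return weighted_sum / sum(weights)


# Hypothetical polls for a party gaining late in a campaign:
# (age in days, support in %). These are not real poll numbers.
polls = [(0, 31.0), (3, 27.0), (7, 22.0)]

slow = weighted_average(polls, 0.07)  # 7% per day
fast = weighted_average(polls, 0.23)  # 23% per day
```

With these made-up numbers, the 23% rate pulls the average closer to the newest poll than the 7% rate does, which is the behaviour that matters when support moves quickly in the final week.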

That does not mean that the daily weight reduction cannot be dialed in more accurately. However, polls in the five campaigns acted very differently from one province to the next. On the one hand, in Ontario, Manitoba, and (especially) Prince Edward Island, increasing the rate of reduction increased the degree of accuracy of the vote projection model. On the other hand, in Saskatchewan and Newfoundland and Labrador, decreasing the rate of reduction upped the accuracy.
The chart above records the total cumulative error of the projection compared to the vote results for major parties. This means that if the model is under-estimating the support of one party by two points and over-estimating the support of another by two points, that equates to a total error of four points. A daily factor of 0.75, for example, means that a poll's weight is reduced by 25% per day.
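
The total error described above can be sketched in a few lines of Python, assuming it is the sum of the absolute differences between projected and actual support. The party labels and percentages are hypothetical and only mirror the two-points-under, two-points-over example.

```python
def total_error(projection, result):
    """Sum of absolute differences, in points, over the parties in `result`."""
    return sum(abs(projection[party] - result[party]) for party in result)


# Hypothetical shares (%): Party A is under-estimated by two points,
# Party B is over-estimated by two points, Party C is exact.
projection = {"Party A": 38.0, "Party B": 32.0, "Party C": 20.0}
result = {"Party A": 40.0, "Party B": 30.0, "Party C": 20.0}

error = total_error(projection, result)  # 2 + 2 + 0 = 4 points
```

Taking absolute values is what keeps an under-estimate and an over-estimate from cancelling each other out.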

For the federal election, a rate of reduction of about 25% would have worked well. For Ontario, a reduction of around 40% or more would have been best, while 35%-40% for Manitoba would have yielded better results. For P.E.I., the accuracy increased with every increase of the rate of reduction, but for Newfoundland and Labrador and Saskatchewan it was the opposite.

What this chart shows is that over the six elections, a daily rate of reduction of between 30% and 40% would have yielded the best cumulative result of 43.7 points of total error. This indicates that, in the future, a daily reduction in this range would be best.

But how to decide? A reduction of 30% would have worked better in the federal, Newfoundland, and Saskatchewan campaigns, while a reduction of 40% would have been the better option in Ontario, Manitoba, and Prince Edward Island.
What I've done in the chart above is rank each rate of reduction from one to six for each election. The lowest total when adding up these rankings, then, should tell us which rate of reduction would have had the better results across the board.
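
The ranking approach can be sketched as follows. The error figures are made up for illustration and do not reproduce the chart; the election labels, the three candidate rates, and the function name are hypothetical. How ties would be broken is not stated in the post, so the sketch simply avoids them.

```python
def rank_sums(errors_by_election):
    """For each election, rank the rates from 1 (lowest total error) upward,
    then add up each rate's ranks across all elections."""
    totals = {}
    for errors in errors_by_election.values():
        ordered = sorted(errors, key=errors.get)  # best rate first
        for rank, rate in enumerate(ordered, start=1):
            totals[rate] = totals.get(rate, 0) + rank
    return totals


# Made-up total errors (points) for three candidate daily reductions.
errors_by_election = {
    "Election X": {0.30: 5.0, 0.35: 6.0, 0.40: 8.0},
    "Election Y": {0.30: 9.0, 0.35: 7.0, 0.40: 6.5},
    "Election Z": {0.30: 4.0, 0.35: 3.0, 0.40: 5.0},
}

totals = rank_sums(errors_by_election)
best_rate = min(totals, key=totals.get)  # lowest rank sum wins
```

Ranking rather than summing raw errors keeps one election with unusually large errors from dominating the choice.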

That turns out to be a reduction of 35% per day, or by a factor of 0.65. This is the top rate for Manitoba, and it ranks in the middle for every other election. Upping the rate of reduction to 40% would have been too much for the federal, Newfoundland, and Saskatchewan elections, while putting it to 30% would not have been the best result for anyone.

Moving forward, the vote projection model will be using this 35% daily reduction during a campaign. Outside of a campaign, I will be applying that 35% reduction on a weekly basis.
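
One way to read the two schedules, as a sketch only: the same factor of 0.65 is applied once per day during a campaign and once per week otherwise. Treating partial weeks fractionally is an assumption, as are the function and constant names.

```python
REDUCTION_FACTOR = 0.65  # a 35% reduction per period


def poll_weight(age_days, in_campaign):
    """Age weight of a poll that is `age_days` old.

    During a campaign the period is one day; outside a campaign it is one
    week. Assumption: partial weeks count fractionally (age_days / 7).
    """
    periods = age_days if in_campaign else age_days / 7.0
    return REDUCTION_FACTOR ** periods


campaign_week_old = poll_weight(7, in_campaign=True)
off_season_week_old = poll_weight(7, in_campaign=False)
```

Under these assumptions, a week-old poll keeps 65% of its weight outside a campaign but only about 5% of it during one.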

But that is only one part of the story. The next task is to figure out the difference between poll results and election results, so that the vote projection model can bridge the gap between the two. Stay tuned.


  1. "The next task is to figure out the difference between poll results and election results, so that the vote projection model can bridge the gap between the two. Stay tuned."

    The last time you looked at this issue you tried to average the variation per party over the '04, '06, and '08 cycles.

    This was counterproductive in the case of the CPC, since this change has been completely one-directional over four cycles now and is growing in power.

    Averaging the effect lowered the strength of the variation when it was in fact still increasing.

    People's thinking on this issue is fuzzy. They're claiming it's a fluke, or that the pattern over four cycles will suddenly reverse itself.

    First we need to come up with some idea of what's causing it!

    What growing trend has been in place since 2004 that over time has gradually worked to the Conservatives' advantage on election day relative to their polling numbers?

    I can think of some possible candidates to consider:

    Changes in voter participation and demographics over time.

    An ever-increasing organizational advantage paid for by total fundraising dominance by the CPC.

    A further issue when we look at polling is that we know voter intensity is greater amongst Conservatives (this may change; watch the data).

    Polling that uses some sort of screen to weed out people unlikely to vote will probably be closer to EC day results than polling that simply dials random Canadians over 18.

  2. I will be looking at this from several angles, not just averaging the past error, and including the latest provincial elections.

  3. Eric, I have a lot of respect for what you do here, but you may be chasing shadows in your attempt to predict election outcomes to a high degree of accuracy every time. One of the major factors in who wins is human emotion, and that is hard to quantify mathematically.
    For example, not many of us could have envisioned what was to happen in the last few days of the recent federal election campaign. However, Jack Layton's brilliant campaign ended with the Orange Crush wave, which threatened to sweep the NDP close to, if not into, power. Canny bugger that he is, Harper pulled the "Red" card and appealed to right-leaning Liberals to stop the madness. Obviously that worked.
    This was all driven by emotion -- the excitement of the Orange wave countered somewhat by the fear of it.
    Good luck building a model to predict that.

  4. Pinkobme, it is never going to be able to predict to an extremely high level of precision, but a better handling of the polls in May 2011 would have had me projecting over 100 NDP seats. And if I had a model at the time that forecasted Tory support to out-perform the polls, I would have had them at a majority.

    I think I'm chasing something more substantial than shadows. It's more like being in a desert - what's real and what's a mirage?

  5. You're assuming that the past results will predict the future results, though. In 2006 the polls under-represented Liberal support. In 2008 and 2011 they under-represented Tory support. How would your model accommodate that shift? Because if it shifts back, your model is screwed.

  6. A suggestion - perhaps it's better to look for gaps between the voting intentions of likely voters and all eligible voters, and identifying which pollsters screen for them appropriately? Or looking at pollster methodologies? That way you might catch changes in how the polls may be biased before the election proves the polls wrong.

  7. As I said, I'm planning to look at it from many different angles!

  8. Ryan, I see no reason to predict that things will shift back from the growing, one-directional trend over four cycles of polls underestimating the CPC.


    Because of news stories like this:

    "Liberals Sound Alarm On Widening Fundraising Gap With Tories"

    Throw in the disappearing public subsidy and the long prospect of rebuilding a dysfunctional organization that isn't providing the GOTV the CPC has, and you're not looking at this going away any time soon.

    Not that it will never change.

    If the Liberals can rebuild, I'd expect that the cycle after next they'll be better at turning potential support in the polls into actual voters on election day.

