Friday, July 18, 2014

The alternate and truer history of the Ontario provincial election

The conventional wisdom is that the polls did not do a very good job of predicting the result of the Ontario provincial election last month. But that is not entirely true. The likely voter models did a very poor job of estimating what the result would be. The polls of all eligible voters - the numbers that in the past have been the only ones reported - did a decent job.

But how did those likely voter polls influence the narrative of the campaign? What was their role in creating the surprise of election night?

To answer these questions, I re-ran my projection model for the entirety of the election without using likely voter numbers. At the time, I had favoured likely voter results when they were available. This time, I've used only the eligible voter numbers. The chart below shows how the two projections would have stacked up against one another. (Note that throughout this post, the dates refer to the final day of polling that would have been included in the model, rather than the date of publishing.)
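The difference between the two runs comes down to which number is taken from each poll. A minimal sketch of that selection follows, and it is an illustration only: the poll figures are hypothetical, the names `pick` and `aggregate` are invented for this example, and the plain unweighted average stands in for the actual projection model, whose weighting is not described in this post.

```python
from statistics import mean

# Hypothetical polls for one party (illustrative numbers only, not real data).
# "likely" is None when a pollster published no likely voter figure.
polls = [
    {"eligible": 34.0, "likely": 38.0},
    {"eligible": 33.0, "likely": None},
    {"eligible": 35.0, "likely": 37.0},
]

def pick(poll, prefer_likely):
    """Return the likely voter figure if preferred and available, else the eligible voter figure."""
    if prefer_likely and poll["likely"] is not None:
        return poll["likely"]
    return poll["eligible"]

def aggregate(polls, prefer_likely):
    """Unweighted average of the selected figures (an assumption, not the real model's weighting)."""
    return round(mean(pick(p, prefer_likely) for p in polls), 1)

with_likely = aggregate(polls, prefer_likely=True)     # how the projection was run during the campaign
eligible_only = aggregate(polls, prefer_likely=False)  # how it was re-run for this post
print(with_likely, eligible_only)
```

With these made-up inputs, the same three polls produce two different averages depending only on which column is read.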

It makes for quite a different story. The difference would have been felt as early as May 9, when the first Ipsos Reid likely voter poll was published. Instead of the wide lead the Tories were given - one that they would hold until May 16 - the PCs would have been given a narrow lead of just under four points. They would have lost that lead as soon as May 12, regained it on May 14, and then lost it - never to have it again - on May 15.

That is far from the story the likely voter polls told. Instead of relinquishing the lead for good on May 15, the Tories were awarded the lead for the first three weeks of the campaign. They were given the lead again, though briefly, on May 21, May 27, and May 29, and were tied with the Liberals on June 6. But according to the aggregate of the eligible voter polls, the PCs never held the lead in the last four weeks of the campaign.

It is interesting to note that the proposed plan to cut 100,000 jobs, which has been credited as the key moment that cost Tim Hudak the election, was announced on May 9, only a few days before the PCs irrevocably lost the lead.

So instead of a yo-yoing, too-close-to-call campaign, the Liberals would have been considered to be in a good position from mid-May straight through to election day. The Liberal lead would not have been large, but with the exception of the post-debate period, the PCs would never have been pegged to have more than 34.9% support after May 14, with the Liberals never below 35.3% after May 16.

The tightening that occurred after the debate (the news about the police interviewing Dalton McGuinty might have had more of an impact) would have still been recorded in the eligible voter polls, but instead of the close race persisting in the last days, the Liberals would have moved comfortably ahead again on June 9.

That the Liberals took 38.7% of the vote would have still been a bit of a surprise, as the aggregate would have never given the party more than 37.8% of the vote, except on May 20. But the PC fall to just 31.3% would not have been too much of a surprise. Over the last three days of polling, the PCs were steadily falling, from 35.4% on June 9 to 33.8% on June 10 and finally 33.1% on June 11. From there, the drop of 1.8 points is easily explained.

Instead, the PCs were pegged to be far higher. Dropping, yes, from 39.2% on June 8 to 36.9% on June 11, but that is a far cry from the 31.3% they achieved on election night. We were expecting a close race. If we had been ignoring the likely voter numbers, we would have expected a Liberal victory - the only question being whether it would have been a minority or a majority (the projection would have had a likely range of 46 to 56 seats for the OLP).

The New Democrats were, for the most part, under-scored throughout the campaign due to the likely voter models. The difference was negligible to the end of May, but afterwards the effect was more significant. The trend lines would have been mostly the same, with the New Democrats dropping steadily from May 31 to June 5, but a big jump would have been recorded from June 5 to June 7, instead of the slump that the likely voter models suggested.

From June 4 to June 9, the likely voter model aggregate pegged the NDP to be under 20% support, bottoming out at 17.6% on June 8. Without those models, the projection would have only once put the party under 20% (on June 5). For the vast majority of the campaign, the New Democrats would have been estimated to within a few points of their final result. The performance of the eligible voter polls, the aggregate at least, would have been right on the mark on election night: 23.9% instead of the actual 23.8%.
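The final-day figures quoted in this post can be set against the result directly. The short sketch below uses only numbers stated above. The variable and function names are invented for illustration, and the published projection's final NDP figure is left out because it is not quoted here.

```python
# Figures quoted in this post, in percent.
ACTUAL = {"PC": 31.3, "NDP": 23.8}
FINAL_ELIGIBLE = {"PC": 33.1, "NDP": 23.9}  # eligible voter aggregate, June 11
FINAL_PUBLISHED = {"PC": 36.9}              # projection favouring likely voter numbers, June 11

def misses(projection, actual):
    """Return projection minus result for each party, in points, rounded to one decimal."""
    return {party: round(value - actual[party], 1) for party, value in projection.items()}

eligible_miss = misses(FINAL_ELIGIBLE, ACTUAL)
published_miss = misses(FINAL_PUBLISHED, ACTUAL)
print(eligible_miss)   # {'PC': 1.8, 'NDP': 0.1}
print(published_miss)  # {'PC': 5.6}
```

On these figures, the eligible voter aggregate overshot the PCs by 1.8 points, against 5.6 points for the projection that favoured likely voter numbers.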

Going forward, I intend to focus primarily on the eligible voter numbers in order to try to tell the story of what the polls are saying is the current state of opinion, rather than what the pollsters are guessing will happen. That is what these likely voter models are in the end, a guess, as no one can estimate with certainty how humans will behave in the future. The huge variation in approaches that each pollster took was testament to that.

I will be recording the results of these models in future elections to provide a bit of a clue to readers as to how turnout might look, but will be keeping the focus on the main numbers reported by the polls until it is proven with consistency that these models will work in Canada. In the Ontario election, the likely voter models were more misleading than informative. This time, a look at the eligible voter polls, like the one above, seems to have told the story more truthfully.

7 comments:

  1. Rather than predicting percentage of votes, polling firms should predict the number of votes. If half of Party A's voters unexpectedly decide to stay home, Party A's vote share will be lower than forecast, but Party B and Party C's share will go up, yet the forecasts for A and B could be exactly right. You can't evaluate the models properly with percentages.

  2. Very good analysis in all your posts. This sort of look at polling data long overdue. Thanks for providing this info. Will repost via my Twitter feed often.

  3. In fact, when the likely voter models are broken down by pollster, you find some (i.e. Ipsos) whose estimates were consistently far off. Why is an interesting question, but the only relevant fact is that the polling averages would have been even more accurate without Ipsos Reid's numbers fouling them up.

  4. I agree that "no one can estimate with certainty how humans will behave in the future". I also agree that likely voter polls are a bit like guesswork, in the sense I understand you are using the term.

    Thing is though, I think that's really true of all polls. In all polls, you're "guessing" how someone will vote, and then in likely voter polls you're guessing again as to whether they will in fact do so.

    I suppose it makes sense that one layer of guesswork would be more reliable than two.

    Replies
    1. In polls you ask someone how they will vote - you're not guessing anything. The pollster takes the response as fact. In "likely voter" polls you ask how a person will vote, then estimate or "guess" the turnout from the likelihood that a person will vote, based on subsequent responses.

  5. And yet, in the end, the polls are all driven by the activities of the political players. In this particular Ontario election I would say the whole scene was very unexpectedly twisted by the activities of one leader and party. In fact I would suggest that the sudden and unexpected twist and shift by one of the major parties essentially threw all the baselines out the window and was a major cause of the poll inaccuracy.

  6. Nice work. I'm sure the Ontario and Quebec elections will yield more insights.

