Friday, January 30, 2015

What are the polls really showing in Sudbury?

As by-elections go, the one in the provincial riding of Sudbury in Ontario has been pretty dramatic. Accusations of skulduggery, a Liberal-turned-independent candidate mixing it up, and a floor-crosser who managed to ford both the ideological gap between two parties and the federal/provincial line.

Some of the polls have been showing a close race between Liberal candidate Glenn Thibeault and the NDP's Suzanne Shawbonquit, with independent Andrew Olivier earning a big chunk of the vote. Other polls have shown Thibeault with a relatively comfortable lead. What gives?

At first glance, it is very puzzling that the polls by Forum Research and Oraclepoll Research can differ so much. Both firms have been in the field twice at exactly the same time. Oraclepoll was in for three to four days, with the Forum polls right in the middle of those sampling periods. So timing should have nothing to do with it.

Forum's first poll on January 13 put the gap at just two points, with Shawbonquit narrowly ahead. Olivier scored just 1%. Then on January 21, the gap was three points for Thibeault as he dropped seven points and Shawbonquit fell 12. Olivier made a miraculous 21-point gain, which Forum called a surge.

Meanwhile, Oraclepoll gave Thibeault a lead of 13 points on January 12-15 at 39% to 26% for Shawbonquit, with Olivier at 19%. On January 20-22, the gap was 16 points (42% to 26%), with Olivier at 20%. In other words, while Forum was showing a major shift, Oraclepoll was showing stability.

So who's right?

There are many reasons to trust Oraclepoll's numbers more than Forum's. Here's why.

Firstly, Forum is based out of Toronto and Oraclepoll is based out of Sudbury. Right out of the gate, the local firm has an advantage both in terms of local knowledge and a local area code showing up on the caller ID.

Secondly, and most importantly, Oraclepoll is conducting a superior survey. Forum is doing its polls via IVR, as it always does. It jumps into the field for a few hours on one evening, gathers its 500 to 800 responses, and reports the numbers a day or two later (response rates being what they are for Forum, it probably needs to call 25,000 to 80,000 households to get even those small samples).

Though it doesn't say explicitly, we can assume that Forum is not calling cell phones. This is because, unlike land lines, it is not possible to know for certain where a respondent on a cell phone is living. This is a problem in by-elections, as ridings do not entirely occupy an area code. This is not an issue in provincial or national polling. Cell phones can be included there without issue, and usually are.

Oraclepoll, on the other hand, is doing its polls with live callers and it explicitly says that it is including cell phones in its sample (perhaps it has a list of local cell phones to call from). It is also in the field for several days and, according to its press release, is calling back numbers where respondents did not answer up to five times before giving up. That is how a poll is supposed to be done.

Forum is getting responses from whoever is home between, say, 7 PM and 9 PM on January 13 and January 21 and is willing to answer a survey. Oraclepoll is doing everything that is reasonably possible to reach everyone it is calling. The potential for a biased sample is far lower with Oraclepoll's approach.

Lastly, Forum's 1% result for Olivier on January 13 defies logic. At the time the poll was released, people were very surprised that Olivier, who had a relatively high profile, was scoring just 1%. That his number jumped from 1% to 22% in eight days makes little sense, particularly when Oraclepoll was showing his support levels to be steady at between 19% and 20% over the same period. While it is conceivable that the discrepancies in the results for the Liberals and NDP can be explained away by sampling issues and statistical probability, the discrepancy in Olivier's numbers is virtually impossible to explain that way.

I don't know what happened in Forum's January 13th poll. But it simply doesn't make sense.

In that survey, Forum inquired as to whether Sudburians (Sudburites? Sudburers?) approved of the candidates on offer. The responses there just don't line up with the voting intentions numbers it recorded.

At the time, the poll found that 95% of respondents were aware or had an opinion of Thibeault, compared to 75% for PC candidate Paula Peroni and just 61% for Shawbonquit. But Olivier, who had just 1% support in the same poll, had 88% recognition - better than either the NDP or PC candidates.

That poll also suggested that 48% of all respondents (including undecideds and those who did not know him) approved of Thibeault, indicating that he was converting about 80% of sympathizers into supporters. Shawbonquit, with 41% approval, was converting an incredible 99% of sympathizers.

Peroni's approval was just 28%, as she was unknown to many respondents, and she was converting just 45% of sympathizers into supporters.

Olivier, though, had an approval rating of 60% of all of those sampled - and this is including those who did not have an opinion or did not know him. That was higher than any other candidate. According to the poll, we are thus supposed to believe that Olivier was converting just 2% of his sympathizers into supporters. If you can believe that, I have a bridge that spans the width of Lake Huron to sell you.
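The "conversion" figure used above is simple arithmetic: voting intentions divided by approval. Here is a minimal sketch of it in Python. The function name is my own, and it assumes (as a simplification) that everyone who intends to vote for a candidate also approves of them.

```python
def conversion_rate(support_pct, approval_pct):
    """Share of a candidate's approvers who also intend to vote for them.

    Simplifying assumption: every supporter is also an approver.
    """
    if approval_pct <= 0:
        raise ValueError("approval must be positive")
    return support_pct / approval_pct


# Olivier in Forum's January 13 poll: 1% support, 60% approval.
olivier = conversion_rate(1, 60)
print(f"Olivier converts {olivier:.1%} of his sympathizers")
```

Rounded to the nearest point, that is the 2% quoted above.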

I think we can reasonably conclude that the January 13 poll by Forum can be discarded. Forum's January 21 poll has no similar problems, and support levels for Olivier and Peroni are similar to those recorded by Oraclepoll. Considering the sample sizes, Shawbonquit's support is also within the margin of error of these two polls, and Thibeault's is only slightly outside of it.

Due to the advantages that Oraclepoll's survey has over Forum's (not to mention the poor record Forum has in by-elections outside of Toronto), the benefit of the doubt should probably go to Oraclepoll. And that's even with the smaller sample size (the margin of error of decided voters would be just under +/- 6%).

If we use that margin of error to estimate support ranges, we'd get Thibeault at between 36% and 47%, Shawbonquit at between 21% and 31%, Olivier between 16% and 25%, and Peroni between 5% and 11%. That is probably as close as we can get to the truth at this stage of the campaign, which will come to a merciful end on Thursday.
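For anyone who wants to reproduce ranges like these, here is a rough sketch using the standard 95% margin of error for a simple random sample. The sample size of 300 decided voters is my own placeholder, chosen only because it gives a maximum margin of just under 6 points; it is not a figure taken from Oraclepoll's release. The shares are the January 20-22 numbers quoted above.

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level ("19 times out of 20")


def margin_of_error(share, n, z=Z_95):
    """Margin of error for one proportion in a simple random sample."""
    return z * math.sqrt(share * (1 - share) / n)


def support_range(share, n):
    """Low and high ends of the range implied by the margin of error."""
    moe = margin_of_error(share, n)
    return max(0.0, share - moe), min(1.0, share + moe)


N_DECIDED = 300  # hypothetical number of decided voters, NOT from the poll

# Oraclepoll, January 20-22 (decided voters)
shares = {"Thibeault": 0.42, "Shawbonquit": 0.26, "Olivier": 0.20}

print(f"Maximum margin: +/- {margin_of_error(0.5, N_DECIDED):.1%}")
for name, share in shares.items():
    low, high = support_range(share, N_DECIDED)
    print(f"{name}: {low:.0%} to {high:.0%}")
```

Note that the margin shrinks as a candidate's share moves away from 50%, which is why a candidate with lower support gets a narrower range than the front-runner.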

Wednesday, January 28, 2015

Breaking down the seat numbers

The current federal projection puts both the Liberals and Conservatives in range of winning a plurality of seats, but neither appears capable of assembling the numbers for a majority government at this point. Let's take a closer look at this.

First, though, I invite you to check out my article today for the CBC. I've delved more deeply into the numbers for British Columbia, an interesting three-way race with multiple sub-regional contests, and Alberta, where some surprises could be in store for 2015.

Despite being behind in the vote projection by 1.3 points, the Conservatives are currently being awarded 136 seats. That puts them 34 seats short of an outright majority (169 seats is the threshold for 50%, but unless a sitting government wants to have the Speaker be from one of the opposition parties, 170 seats are needed).

Conservative ranges
If we look more broadly at the likely ranges, the Conservatives have between 117 and 155 seats. That still puts them 15 short of a majority government, a big gap to overcome. The polls would need to be off by a fair bit, and a lot of the close races would have to fall in their favour, for a majority government to be won at this stage of the game.

The chart on the left breaks down the Conservative ranges over the last three weeks. I've removed the Liberal and NDP ranges for clarity. Below I've done the same for the Liberal and NDP numbers.

If we extend things even further to the maximum ranges, the Conservatives are in play in 189 seats. That gives them a cushion of 19 for a majority government, but would require the polls to be significantly off the mark. The maximum projection, for instance, is based on national support of just over 41%.
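The seat arithmetic in the paragraphs above is easy to check. This is just a sketch of the subtraction, using the 338-seat House and the 170-seat working majority described above; the function name is my own.

```python
TOTAL_SEATS = 338                        # ridings in the 2015 election
WORKING_MAJORITY = TOTAL_SEATS // 2 + 1  # 170, one more than half


def seats_from_majority(seats):
    """Positive: seats short of a majority. Negative: size of the cushion."""
    return WORKING_MAJORITY - seats


conservative_scenarios = [
    ("projection", 136),
    ("high end of likely range", 155),
    ("maximum range", 189),
]

for label, seats in conservative_scenarios:
    gap = seats_from_majority(seats)
    if gap > 0:
        print(f"{label}: {seats} seats, {gap} short of a majority")
    else:
        print(f"{label}: {seats} seats, cushion of {-gap}")
```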

The Conservatives' hopes of a majority government really come down to Ontario and British Columbia. Even in the best of circumstances, where the party takes 41% of the vote, their seat total would only be increased by 16 outside of these two provinces. That isn't enough. Instead, a majority government will be won or lost in British Columbia, where they could boost their haul by nine seats, and particularly in Ontario, where they could take 24 more seats.

However, that is not the situation the party currently finds itself in, well short of a majority and neck-and-neck with the Liberals in the seat ranges. Even if the party does manage to end up at the high end of their current range, or 155 seats, the Liberals and New Democrats would likely end up with a majority between them. Would a Conservative minority government really last very long?

Liberal ranges
The Liberals, currently at 126, are well short of a majority and are even short of the Conservatives for a plurality. But their likely range puts them between 107 and 144 seats, and so capable of placing first in the seat count. A majority government does not appear in the cards at the moment - the maximum for the party is 165, five short of the mark.

For the Liberals, whether they end up on the high end of the range depends almost entirely on Ontario and Quebec. They are currently close to their maximum in British Columbia, Alberta, the Prairies, and Atlantic Canada. Even if the party ends up at the maximum range of their likely support, they can only pick up six extra seats in these regions.

If anything, these regions are more likely to disappoint. If things go badly, the Liberals could finish with 18 fewer seats in these regions than they are currently slated to take.

Ontario is the big prize, though, with the current range being between 45 and 63 seats. That accounts for roughly half of their national range.

There is slightly less potential for a breakthrough in Quebec, but it is nevertheless important as the party currently sits at between 20 and 30 seats there. The maximum goes up to 35. What this means is that if the Liberals are slightly under-performing in Ontario and Quebec, they cannot make up those losses with better performances in the rest of the country. Their election chances live and die in Ontario and Quebec.

For the New Democrats, placing first or second is currently not envisioned. Their overall take is slated at 72 seats at the moment. While that is down 31 seats from what they managed in 2011, it would still rank as their second-best performance in party history. But their sights are set higher than that.

NDP ranges
The ranges suggest the party is unlikely to return to pre-2011 levels, with between 54 and 87 seats. The lower end of the scale, however, would lead to more gains for the Bloc Québécois. This has the potential to complicate the arithmetic in a minority legislature. With the Bloc at a half-dozen seats or fewer, the NDP and Liberals can form a majority on their own. Once the Bloc starts creeping closer to 10 seats, that becomes less likely. And the only route for a stronger Bloc is through the NDP.

The maximum ranges for the party do suggest it could end up with as few as 30 seats, but that assumes a major collapse, particularly in Quebec. Their higher range of 109 seats is more interesting, as it would likely lead them to finishing second in the seat count. Though the NDP's highest range is higher than the lowest range for the Liberals and Conservatives, it is not conceivable that all three can occur at the same time. If the NDP is as high as 26%, for instance, the Liberals are likely down quite a bit - which means the Conservatives win more seats. At this stage, an outright NDP victory does not seem plausible.

Quebec really is where everything will be decided for the NDP. Stronger performances outside of the province would only add six seats to the current projection of 72. A strong performance in Quebec could add nine seats alone (or 16 if things go very well). A poor performance in Quebec could drop the party by nine seats. A really bad performance could drop them by 22.

Ontario is also important for the NDP, but more in terms of avoiding a serious drop in seats. They are considered to be in play in just 18 ridings there at the moment, but could drop to eight with even a slight under-performance. Ontario is more about salvaging things for the NDP than making gains.

That about sums up the lay of the land at this point, with the likely ranges serving as a good guide to where things could potentially go in the coming months if no major shifts take place. The maximum ranges tell us what could happen if those major shifts take place. If that happens, of course, the numbers will be re-calculated to see where things could go from that new vantage point. It will be an interesting year.

Sunday, January 25, 2015

Introducing the 2015 federal election projection model

It is hard to believe that 2015 is now finally upon us. The 2011 federal election seems like eons ago. At the time, the prospect of a majority government, after seven years in which four elections were held, seemed like an endless expanse. But we're now finally in the election year, so it is time to get the site's projection model up and running.

I'll get into analyzing the numbers themselves in the coming days. You can click on the link above or the table above to see the breakdown. For now, let me go over the features of the model, the changes that have been made, and how we got here.

What's been learned since 2011

The model that will be used for the 2015 federal election, and which has been used in 12 provincial elections from coast to coast, has its roots in the 2011 federal vote, where it was first employed. In some ways, that election was a successful proof of concept. In other ways, it was a failure.

Those who doubt the usefulness of projection models and other critics will undoubtedly point to that failure in 2011. They would not be wrong in doing so, but it would be dishonest. Why? Because the projection model worked very well in 2011.

At least, the seat projection model did. After the election was over, I plugged the results in each region of the country into the model, as if the polls had nailed the call. The outcome was exceedingly close to what actually happened:

The model would have missed the Conservative count by just five seats, and pegged the Liberals and New Democrats to within three. Even the huge breakthrough in Quebec for the NDP would not have thrown the model for a loop - the New Democrats would have been awarded 60 seats and the Bloc Québécois just four (the actual result was 59 and four, respectively).

So the seat projection model proved its ability to translate regional support levels into accurate numbers of seats. This has been proven again and again in provincial elections since. But the call in 2011 was nevertheless missed.

There were two reasons for that. The first was entirely my fault. At the time, I had been basing my model on past elections like those in 2006 and 2008. In each of those elections, the polls in the last weeks of the campaign hardly budged. The swing that occurred over the holidays in 2005-2006 had settled in for some time by the end of that campaign. With those elections as my guide, I calculated that I needed to decay the weight of a poll by just 7% with each passing day.

That was a grave error. It is a system that would have worked well in 2006 or 2008, but utterly failed to capture the late swing away from the Liberals and towards the New Democrats that occurred in 2011. What happened was that, by election day, the projection model was still roughly one week behind where it should have been. Accordingly, it projected a Conservative minority government with 78 seats going to the NDP, 60 to the Liberals, and 27 to the Bloc.

After that disaster, I changed the way the weighting system worked and increased the decay to 35% per day. This has been employed in the past dozen election campaigns and has had no trouble capturing the shifting mood of the public. But that is, of course, if the polls capture it as well. And that is the second reason why 2011 did not go well.
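To give a sense of how much the decay rate matters, here is a simplified sketch of a date-weighted poll average. It is not the site's actual weighting code: it assumes the decay compounds daily and ignores every other weighting factor, and the polls are invented numbers for a party surging over the final week of a campaign.

```python
def poll_weight(age_days, daily_decay):
    """Weight of a poll that is `age_days` old, assuming the decay compounds daily."""
    return (1 - daily_decay) ** age_days


def weighted_average(polls, daily_decay):
    """polls: list of (age_days, support_pct) tuples."""
    weights = [poll_weight(age, daily_decay) for age, _ in polls]
    weighted_sum = sum(w * support for w, (_, support) in zip(weights, polls))
    return weighted_sum / sum(weights)


# Hypothetical party climbing from 20% to 30% over the last week.
# Each entry is (days before election day, support in percent).
polls = [(7, 20.0), (5, 22.0), (3, 26.0), (1, 29.0), (0, 30.0)]

slow = weighted_average(polls, daily_decay=0.07)  # the 2011 setting
fast = weighted_average(polls, daily_decay=0.35)  # the setting used since

print(f"Week-old poll weight at 7% decay:  {poll_weight(7, 0.07):.2f}")
print(f"Week-old poll weight at 35% decay: {poll_weight(7, 0.35):.2f}")
print(f"Average at 7% decay:  {slow:.1f}%")
print(f"Average at 35% decay: {fast:.1f}%")
```

With the slower decay, a week-old poll still carries most of the weight of a fresh one, so the average lags behind a late surge. With the faster decay, the newest polls dominate.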

The polls did a moderately good job of things in 2011. They were on the money in Quebec, where the electorate had shifted so dramatically. But at the national level, the polls missed out on the Conservative majority government. With the system in place now, the vote projection model would have given the Conservatives 36% of the vote, instead of the 39.6% they actually got. Both the NDP (31.5% instead of 30.6%) and the Liberals (20.1% instead of 18.9%) would have been over-estimated. The seat projection would have put the Conservatives about a dozen seats shy of a majority government.

This is why, starting with the 2012 Alberta provincial election, I developed a system to estimate likely ranges of support. This system takes into account the potential for the polls to be wrong, and is reflected in the table at the top of the page by the low/high and minimum/maximum ranges. Nineteen times out of 20, the result should fall within the maximum/minimum ranges. There are varying levels of likelihood of the results falling within the other ranges, as explained on the methodological page.

These ranges allowed me to be one of the only people to consider the possibility that Wildrose would not win that 2012 vote.

Starting in 2013, I added another element to the projection model. I found that, too often, people were taking the projected results in individual ridings too literally, as if they were a real indication of support. For that reason I have long thought about not publishing riding projections. But I think it is important to 'show my work', and that is what the riding projections do. They also include ranges that should give people a decent idea of who is and who isn't in play in a given riding.

Still, I wanted to make sure people considered the potential for error. I added probabilities to the projection model, measuring the likelihood that the call the projection model makes in each individual riding will be right.

To take London North Centre, where the Liberals held their two-day caucus meeting last week, as an example, the model currently says that the Liberals would win it with between 43% and 50% of the vote. The Conservatives, at 29% to 34%, are apparently not in play. But the model also says that the Liberals have an 82% chance of winning it. That means that, despite the ranges projected by the model, there is still a roughly 1-in-5 chance that the Liberals will not win the riding.

These probabilities were the last major addition made to the basic projection model, which has been employed with very few changes in the most recent provincial elections in British Columbia, Nova Scotia, Quebec, Ontario, and New Brunswick. A few new minor features, however, have been added for 2015.

What's new in 2015

While none of the basic mechanisms of the model have changed (see the methodological page for complete details on how it works), there are some differences between the 2015 model and the model that was used in prior elections.

First is something that is gone. In past elections, I've applied a "floor-crosser" factor in ridings where a sitting MP has crossed the floor to run with another party. It has had a mixed record. In some cases, the "floor-crosser" factor was quite effective. In others, it was greatly off the mark. I have decided to drop it as a factor that is taken into account. In its place, the party losing a floor-crosser suffers a 'no incumbent' penalty, and the party picking up the floor-crosser gets a 'star candidate' bonus. In past elections, this would have worked better than the system that had been used.

A new factor that has been added to the model, however, takes into account party leaders. My research has suggested that leaders are far more difficult to defeat than normal incumbents (though it can always still happen, of course) and that an MP running as a leader for the first time performs even better than other incumbents. The loss of a leader as a candidate in a given riding, in addition, is far more penalizing than the loss of the average MP. So, for example, Gilles Duceppe's old riding will be harder to win back for the Bloc Québécois than the average riding they lost in 2011.

The particularities of the 2015 election

As always, an election has its own particularities that need to be taken into account. The idea behind the projection model is that it can be applied uniformly in all jurisdictions in Canada. But the fact is there are always oddities in individual elections that have to be considered.

The new federal riding boundaries are one thing that will make the election potentially harder to call than would have otherwise been the case. But the model has dealt with new boundaries before. In the 2011 Manitoba provincial election, for example, 56 of 57 ridings were called correctly despite the shifting boundaries. For this election, the model merely uses the transposed results that were calculated by Elections Canada. MPs who have shifted ridings are treated as incumbents. Ridings with no incumbents are left unadjusted by the incumbency factor.

The presence of Forces et Démocratie complicates matters. Normally, new parties have been handled by the model without much difficulty. Their support is awarded equally in all ridings within a given region (with the correct regional results, the model would have been within one seat of Wildrose's result in 2012, despite the party fielding candidates for the first time in many ridings).

But no pollster has included FeD in its surveys yet, so we cannot gauge their support. What I have done, until better data becomes available, is to treat Jean-François Larose in Repentigny as I would any sitting MP running as an independent (he retains a portion of his 2011 vote). For Jean-François Fortin, I have simply decided to adjust his 2011 support levels with the Bloc exactly as Jean-Martin Aussant's PQ support in the 2012 election shifted when he led Option Nationale. He seems like an appropriate example to use as a guide.

The floor-crossing of Bruce Hyer from the NDP to the Greens (via a stint as an independent) is also hard to model. Floor-crossing between the 'establishment' parties is relatively straightforward. Not so when it is the Greens, because their base in Hyer's riding was so small in 2011. Treating him like any other floor-crosser would, I think, under-state his likely support. What I've done for him, then, is to adjust his support levels in the same way that Blair Wilson's support shifted when he crossed to the Greens before the 2008 election. Hopefully a riding poll will be done in the riding to give us a better idea of what is going on, as I fear the system in place is also inadequate.

The last particularity of the 2015 election is the presence of an independent candidate like Inky Mark. In 2011, the model did not make any adjustments for the independent candidacy of Hec Clouthier, a former Liberal MP, in Renfrew-Nipissing-Pembroke, who captured a respectable share of the vote. Mark used to be an MP for the riding of Dauphin-Swan River-Marquette (now known as Dauphin-Swan River-Neepawa) and is running as an independent. So, based on past cases of former MPs attempting a comeback as an independent, Mark has been awarded a portion of the support he had in the last election.

Don't miss the forest

As always, I urge readers to exercise caution when looking at the projections. Consider the ranges very carefully. Don't take the individual riding projections as fact. The model simply cannot take into account the effects of local issues in all 338 ridings, and some will be wrong.

Take, for example, the riding of Outremont. It is currently projected to go Liberal, with the NDP in range of winning it. That is Thomas Mulcair's riding. Do I think he is actually going to lose it? Not for a second. But the model does, simply because of the gains the Liberals have made province-wide in Quebec.

And that brings up something I am concerned about in the 2015 election. Polls have shown strong gains for the Liberals in Quebec. They have roughly doubled their support in the province since 2011. The model, then, doubles their support in every riding. That means the party is in a good position to sweep Montreal, where they had a decent base in 2011.

But regional polling by CROP has shown disproportionate growth for the Liberals outside of Montreal, meaning that the model may be inflating the party's numbers too much in Montreal and not enough outside of it. This is why, for example, it could be very wrong about Outremont.
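The doubling described above is what is usually called a proportional swing, and a stripped-down sketch shows why it can exaggerate gains where a party already had a base. This is an illustration only: it leaves out incumbency and every other factor in the model, the final step that rescales the shares to 100% is my own assumption, and all of the numbers are invented round figures rather than real results or polls.

```python
def proportional_swing(riding_2011, region_2011, region_poll):
    """Scale each party's 2011 riding share by its regional poll-to-2011 ratio,
    then rescale so the riding's shares add up to 100 (an assumed step)."""
    scaled = {
        party: share * region_poll[party] / region_2011[party]
        for party, share in riding_2011.items()
    }
    total = sum(scaled.values())
    return {party: 100 * value / total for party, value in scaled.items()}


# Invented province-wide numbers: the Liberals double from 14% to 28%.
region_2011 = {"Liberal": 14.0, "NDP": 43.0, "Other": 43.0}
region_poll = {"Liberal": 28.0, "NDP": 30.0, "Other": 42.0}

# Two invented ridings with very different Liberal bases in 2011.
montreal_riding = {"Liberal": 35.0, "NDP": 45.0, "Other": 20.0}
rural_riding = {"Liberal": 8.0, "NDP": 45.0, "Other": 47.0}

for name, riding in [("Montreal", montreal_riding), ("Rural", rural_riding)]:
    projected = proportional_swing(riding, region_2011, region_poll)
    summary = ", ".join(f"{party} {share:.0f}%" for party, share in projected.items())
    print(f"{name}: {summary}")
```

In the invented Montreal riding the Liberal number balloons, while in the rural riding doubling a tiny base changes little. If the real growth is concentrated outside Montreal, as CROP suggests, a uniform ratio will overshoot in one place and undershoot in the other.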

The overall seat count for the province, however, is likely to be accurate. Using the polls from CROP, I made projections for the province at the sub-regional level. The end result was not much different. At the riding level, there would be some important differences - more Liberal wins outside of Montreal, fewer around the city. But the sum total of the seat count remains the same - particularly when we consider the ranges. This is why I beg readers not to miss the forest for the trees. The individual riding calls are not nearly as important as the overall regional and national projections.

You might wonder why I don't adjust the model to project sub-regionally in Quebec. The issue with that is that no other pollster is releasing sub-regional numbers for the province. And in 2011, CROP exited the field over a week before the end of the campaign. It would seem unwise to turn the Quebec portion of the model into a sub-regional one, when in all likelihood I won't have information to plug into it.

That about covers everything. I reserve the right to make some adjustments between now and the start of the campaign, though I don't suspect anything but tiny tweaks (perhaps to the values of the 'factors', for example, if I have time to plug some more data points into my calculations). If you notice anything that looks fishy, or even typos, please do alert me to them as errors could have crept in.

Though the model is fully automated, the graphics aren't. They can take quite some time to pull together, so at this stage I'm not sure how frequently I will update the projection. At least weekly, certainly, but updates with every individual poll, as I was doing for the running poll averages, may be too much. And just to make things clear: no, I am not changing the name of the site to ThreeThirtyEight or some variant of that. Consider the name of the site as a historical monument to when it was launched, during an era of seemingly endless minority governments.

Are we heading back to that this year? The current projection, with the Conservatives at between 117 and 155 seats and the Liberals between 107 and 144, would seem to say yes. But more on that later this week.