Tuesday, August 13, 2013

The new projection model for the upcoming Nova Scotia election

The next provincial election likely to occur will be in Nova Scotia. Premier Darrell Dexter needs to call the vote by the spring of 2014, but all indications are that he will drop the writ some time in the late summer or early fall. Accordingly, ThreeHundredEight is today launching the Nova Scotia projection.

Unfortunately, there isn't a lot of polling data to go on just yet. The most recent numbers we have for the province date to the end of May, when Corporate Research Associates was last in the field for their quarterly polling. They should be out with the numbers from their next quarterly poll in early September.

The projection model has been given a few tweaks, as is always the case after each election - and particularly after such a humbling election as the one in British Columbia. One thing that has not been tweaked, however, is the seat projection model or the way that polls are weighted. The seat projection model has, time and again, shown itself to be a useful tool if the numbers plugged into it are accurate. That means the challenge is not to estimate the number of seats each party will win, but rather what proportion of the vote they will get.

A detailed description of how the polls are weighted and seats are projected can be found here. I will not go over that again, but rather explain some of the changes in the new model and the philosophy behind them.

First, a quick guide to what the new main projection graphic is showing:
At first glance, you will note that I have dropped forecasts for the election date from the chart. This is because I have decided to take a different approach to projecting electoral outcomes.

The model I used in the provincial elections of Alberta, Quebec, and British Columbia was based on a simple premise: what can the polls tell us about how they will be wrong? The inherent problem with that approach is obvious.

I was using the sample size of all the polls being taken into account by the model to estimate a rough margin of error for the projection. That gave me my high and low ranges. I then also used the degree of volatility from poll to poll to estimate the potential for future volatility in the days remaining between the last set of polls and the election day.

But if the polls are off, it does not make much sense to use those polls to guess at how they will be off. The data they are recording is not going to give a hint at the potential error if the foundations upon which they are built are faulty.

Instead, the approach for the Nova Scotia election (and, if it works well, for future elections) is to use the degree of error polls have shown in the past to assess the probable range of outcomes.

I was already using this sort of adjustment in the last few elections. It was applied directly to the poll average, using the polls' average over- or under-estimation in past elections to guess at how they might over- or under-estimate party support in future elections (in B.C., this was done for the Greens and Conservatives and worked very well). But rather than use this information to make a best guess, I will instead be using it to give a likely range of outcomes.

This is calculated based on a party's position in the legislature at dissolution: the governing party, the Official Opposition, a third party with multiple seats, a third party with a single seat, and parties without a seat in the legislature. The electoral outcome for each of these parties in recent elections is then compared to the polling average.

All cases in which a party in a particular position in the legislature was under-estimated in the polls are then used to calculate the average "High". For example, the average under-estimation (when polls under-estimated a party's support) in recent elections for the governing party has been by a factor of 0.97. That means that the weighted polling average (which does make an independent estimate of "Other" support) is adjusted by a factor of 0.97. The same is done for cases of over-estimation to calculate the "Low", and these numbers are then used to project the number of seats that can be won at these high and low numbers.
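
To make the arithmetic concrete, here is a minimal sketch of how a "High" and a "Low" could be derived from past polling misses. It is not the model's actual code: the function name, the made-up ratios, and the reading of the 0.97 factor as the poll average divided by the actual result (so that dividing by it raises the number) are all assumptions for illustration.

```python
from statistics import mean

def high_low(poll_average, past_ratios):
    """Return (low, high) vote shares for one party.

    past_ratios holds poll/result ratios from past elections for parties in
    the same legislative position. Assumed convention: below 1.0 means the
    polls under-estimated the party, above 1.0 means they over-estimated it.
    """
    under = [r for r in past_ratios if r < 1.0]
    over = [r for r in past_ratios if r > 1.0]
    # Dividing by the average ratio undoes the typical miss in each direction.
    high = poll_average / mean(under) if under else poll_average
    low = poll_average / mean(over) if over else poll_average
    return low, high

# Hypothetical ratios, chosen so that the under-estimates average 0.97.
ratios = [0.96, 0.98, 0.97, 1.04, 1.10]
low, high = high_low(40.0, ratios)
print(round(low, 1), round(high, 1))
```

The resulting low and high vote shares would then each be fed through the seat projection model to get the corresponding seat ranges.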

The minimum and maximum projections ("Min." and "Max." on the chart) are simply the worst cases of over- and under-estimation that have occurred in recent elections. That means the Alberta Progressive Conservatives in 2012 (governing party, under-estimated), the Newfoundland and Labrador Liberals in 2011 (Official Opposition, under-estimated), Wildrose in 2012 (other party with multiple seats, over-estimated), etc. In other words, it means that if the election outcome falls outside of these minimum and maximum ranges, the polls have missed the target by an unprecedented amount. That this is a possibility should not be entirely discounted.
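
The same idea can be sketched for the minimum and maximum, using the single worst miss in each direction instead of the average. Again, this is only an illustration under assumptions: the ratios are invented, the poll/result convention is my own, and `is_unprecedented` is a hypothetical helper name.

```python
def min_max(poll_average, past_ratios):
    """Return (minimum, maximum) vote shares from the worst past misses.

    past_ratios are poll/result ratios (assumed convention): the largest
    ratio is the worst over-estimation, the smallest the worst
    under-estimation.
    """
    worst_over = max(past_ratios)
    worst_under = min(past_ratios)
    minimum = poll_average / worst_over if worst_over > 1.0 else poll_average
    maximum = poll_average / worst_under if worst_under < 1.0 else poll_average
    return minimum, maximum

def is_unprecedented(result, minimum, maximum):
    """True if the actual result falls outside anything seen in the data."""
    return result < minimum or result > maximum

# Hypothetical ratios for one legislative position.
minimum, maximum = min_max(40.0, [0.80, 0.97, 1.25])
print(minimum, maximum, is_unprecedented(51.0, minimum, maximum))
```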

This hopefully gives readers a full understanding of the potential range of outcomes that are possible, based on past polling performance and what the data is showing. My role is not to make bets, but to try to figure out what the polls are saying and what they aren't saying, and to give people an idea of what to expect. But to narrow it down a little, I have also included boxes showing the range of most likely outcomes for each party. This means that each box represents the degree of polling error that has occurred in a majority of recent elections for a party in a similar legislative position. The chart below spells this out for the Nova Scotia vote:
For the New Democrats as the governing party, there is a 68% chance that the electoral outcome will fall within the average-to-high range. This is the range that is highlighted in the main projection chart. If we want to extend that further, we can say there is a 79% chance it will fall between the low and high marks, or there is an 84% chance that it will fall within the average-to-maximum range. This suggests that we should expect the polls to under-estimate NDP support - though that is not necessarily what is going to happen.

For the Liberals as the Official Opposition, the range is not so tight. The most likely individual outcome is for the result to fall within the average-to-high range (42%), but it is more likely that it will fall outside of that range (the remaining 58%). To find the smallest range that incorporates the most likely outcome, we have to stretch that to the low-to-high range. There is a 63% chance that the outcome will fall within that range.

For the PCs, a third party with multiple seats, a slight over-estimation is the most likely individual outcome, but we also have to stretch the range to low-to-high to get to a 67% chance. For the Greens, an over-estimation is almost certain: there is a 60% chance the outcome will fall within the low-to-average projection, and a 95% chance that it will fall between the minimum and the average.
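
The choice of which range gets highlighted for each party can be expressed as a small rule: take the narrowest range that has held in a majority of past cases. The sketch below uses the percentages quoted above for the NDP and the Liberals; the function name and the assumption that the candidate ranges are listed from narrowest to widest are mine.

```python
def narrowest_majority_range(candidates, threshold=0.5):
    """Pick the first range whose probability exceeds the threshold.

    candidates is a list of (label, probability) pairs, assumed to be
    ordered from the narrowest range to the widest.
    """
    for label, probability in candidates:
        if probability > threshold:
            return label, probability
    return None  # no range covers a majority of past outcomes

# Probabilities as quoted in the text for the two largest parties.
ndp = [("average-to-high", 0.68), ("low-to-high", 0.79),
       ("average-to-maximum", 0.84)]
liberals = [("average-to-high", 0.42), ("low-to-high", 0.63)]

print(narrowest_majority_range(ndp))
print(narrowest_majority_range(liberals))
```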

This is the extent of the probabilities that will be calculated for the province-wide projection. I was not pleased with the picture the probabilities painted in the B.C. election: the NDP was given a 98.3% chance of winning the popular vote, and an 83.3% chance of winning the most seats. Technically, that doesn't mean the probability forecast was wrong. Realistically, the polling data may not have supported a 1.7% chance of the B.C. Liberals prevailing. The seat projection probabilities were probably closer to the mark (expecting this sort of "B.C. Surprise" in one out of every six elections is not entirely unrealistic), but I've decided to drop this calculation for the time being until I have the chance to go over in more detail what these calculations were based upon. I think the seat probabilities are on the right track (they were based, after all, on 308's track record), but the vote probabilities may have in the end been based on inappropriate data.

Probability calculations for individual seat calls, however, will remain a feature of the new model. They performed very well in the B.C. election, the first time they were calculated:
In the ridings where the confidence of a correct call was between 50% and 59%, 60% were called correctly. When the confidence was between 60% and 69%, the accuracy rate was 71%, and so on, as the chart above shows. The results of the B.C. election have been incorporated into the probability calculations, which has had the effect of narrowly boosting them upwards (yes, one of the ridings is called with 100% confidence!).
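
For readers who want to run this kind of check themselves, here is a rough sketch of how riding calls can be grouped by stated confidence and scored against results. The data below is invented purely to show the mechanics, and the function name and the choice to fold 100% calls into the 90s bin are assumptions, not a description of how the numbers above were produced.

```python
from collections import defaultdict

def calibration_table(calls):
    """Map each confidence bin (50, 60, ..., 90) to its share of correct calls.

    calls is an iterable of (confidence_percent, was_correct) pairs, with
    confidence given as a whole number from 50 to 100.
    """
    bins = defaultdict(lambda: [0, 0])  # bin -> [correct calls, total calls]
    for confidence, was_correct in calls:
        lower = min(confidence // 10, 9) * 10  # 100% is folded into the 90s
        bins[lower][0] += int(was_correct)
        bins[lower][1] += 1
    return {b: hits / total for b, (hits, total) in sorted(bins.items())}

# Hypothetical riding calls: (confidence in percent, called correctly?)
calls = [(52, True), (55, False), (57, True), (58, True), (59, False),
         (61, True), (65, True), (68, False), (69, True),
         (95, True), (100, True)]
table = calibration_table(calls)
print(table)
```

A well-calibrated set of calls should show accuracy in each bin that is roughly in line with the stated confidence for that bin.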

Unless more detailed polling data is made public, the model will not be presenting projections at the regional level. The model is designed for regional-level polling, however, splitting the province up into Cape Breton, the Halifax Regional Municipality, and the Rest of the Mainland. Corporate Research Associates divides the province like this, but I don't know yet whether this information will be available throughout the campaign. If this information is not available, regional support will be 'estimated' from province-wide polls in order for the data to be plugged into the model. I don't expect a problem related to this, however. I got a peek at CRA's regional results for their last poll, and they generally mirrored what a proportional swing would calculate from province-wide results. It seems that, at this stage at least, support for the parties in Nova Scotia is increasing and decreasing proportionately at the regional level.
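
For those unfamiliar with proportional swing, the sketch below shows one way regional support could be estimated from a province-wide poll: each party's previous regional share is scaled by the ratio of its current province-wide poll number to its previous province-wide result. The party labels and numbers are hypothetical, and the final step of rescaling the shares to sum to 100 is my own assumption rather than something described above.

```python
def proportional_swing(prev_regional, prev_province, poll_province):
    """Estimate regional vote shares (in percent) from a province-wide poll.

    Each argument maps party name -> vote share in percent.
    """
    scaled = {
        party: prev_regional[party] * poll_province[party] / prev_province[party]
        for party in prev_regional
    }
    total = sum(scaled.values())
    # Rescale so the regional shares add up to 100 (an assumed extra step).
    return {party: 100.0 * share / total for party, share in scaled.items()}

# Hypothetical numbers for three parties in one region.
prev_province = {"Party A": 50.0, "Party B": 30.0, "Party C": 20.0}
prev_regional = {"Party A": 40.0, "Party B": 35.0, "Party C": 25.0}
poll_province = {"Party A": 40.0, "Party B": 40.0, "Party C": 20.0}

estimate = proportional_swing(prev_regional, prev_province, poll_province)
print({party: round(share, 1) for party, share in estimate.items()})
```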

I am hopeful that these new measures will provide a good result in Nova Scotia. But it is a small province with which to launch a new methodology. The populations of Toronto or Montreal, not to mention a few other cities, are larger than that of Nova Scotia (accordingly, I am hoping to cover their mayoral elections). Why should readers who don't live in the province be interested?

Well - the province does have interesting elections. There are three competitive parties and a lot of ridings where all three parties have a chance of winning. The NDP government there is only in its first term and is the first NDP government to be elected in Atlantic Canada. A rebuke may not mean anything in particular for the federal New Democrats, but it certainly would not bode well. The Liberals have been buoyed throughout the region, perhaps due to the federal party's new appeal, and have not been in power in Nova Scotia since 1999. The PCs, for their part, need to get out of the trough that delivered them their worst performance in provincial history in 2009.

If that isn't enough, provincial results in Nova Scotia generally track well with those of the federal parties in the province - if not always explicitly, at least relatively (they seldom go in different directions).

So, perhaps the provincial election result in Nova Scotia will tell us something about how the federal parties will do in the province in 2015. Perhaps not, however. The numbers generally tracked pretty well until recently.

I don't expect a huge number of polls for the race in Nova Scotia, unless one of the firms feels it has something to prove. But hopefully there will be enough to keep things interesting.

19 comments:

  1. This new way of presenting the model is very interesting! It's going to take me some time to get my head around it, but it looks like an excellent way of showing the probabilistic nature of the projection.

  2. Thanks for your analysis Eric!

  3. Just as an aside

    This Senate scandal thing just gets bigger and bigger.

    Harper's hope that this would all blow over by summers end ain't gonna happen. In fact as the summer has gone on the story just gets stinkier. And Harper's CPC at the moment owns it !! Sure there are four Senators involved three of which Harper appointed !!

    Replies
    1. I think the Liberals own this scandal as much as the Tories. Mac Harb is under investigation. Who can forget another fine Liberal politician Sen. Andy Thompson whose residence in Mexico allowed him little free time to conduct his Senate duties. This particular scandal is as old as the Senate; I don't expect it or the Senate to disappear anytime soon.

    2. The difference being that three of the four Senate bad persons were appointed by Harper.

      Why do you think he wants to prorogue ?

  4. "The minimum and maximum projections ("Min." and "Max." on the chart) are simple the worst cases of over- and under-estimation that has occurred in recent elections."

    Have you considered a more statistical approach to that? If you take a large enough sample of previous provincial elections, can you get a reasonable probability distribution for how far off the polls could be?

    It seems silly to just use a couple of elections in isolation to establish a rough range. It would be far nicer to see something like 67% and 95% confidence lines or some such.

    Replies
    1. The probabilities calculated for the ranges already do that - limiting it to 95% or some other number would not make much of a difference.

      And the elections in B.C. and Alberta have so impacted the perception of polling accuracy (or inaccuracy) that I find it useful to give people an idea of what an error of that magnitude would look like.

    2. Also, to be frank, I thought I had done something relatively sophisticated for the B.C. election, but the polls turned out to be useless and all the effort I put into it along with them. So, I prefer to keep things simple until I start rebuilding some confidence in the polls.

    3. Well, a 95% or 67% confidence level would at least give us an idea what each range means. We all know the polls can be off, but what would be helpful to see is exactly how off how often.

      Who's to say the polls won't be off even more than Alberta, for that matter? If the errors can be fit to a normal distribution (or any other distribution for that matter) then we get an idea of how likely it is the polls will be even worse than that case.

    4. Well, as I said the ranges already do give an idea of probability. The low and high ranges have a confidence level of between 63% and 79%, depending on the party.

      I'll grant you that I could have limited the min and max to 95% instead of 99%+, but I'm not sure how much more interesting or helpful that would be.

      The max and min projections are meant to show the absolute best and worst case scenarios, as there isn't any real way for the model to predict errors of an even greater magnitude. And the data set isn't overly large - I don't want to give a false impression of confidence in what the data can show. Considering the problems polls have been having, I think we're looking at a disproportionate amount of noise.

    5. I guess my point is that there are no absolute best and worst case scenarios. It's just a question of how likely or unlikely something is.

  5. Unfortunately Eric, your model doesn't account for vote-parking... which is exactly what we're seeing in NS right now. The 2013 McNeil NS Libs, who are benefiting from it, are best compared to the 2010 Duguay NB NDP.

  6. To be honest, I'm not really surprised that there's such a flux and craziness in Canadian polling. It's true throughout Canadian politics. Start from the point that only 2 of Canada's 14 first ministers (Ghiz & Wall) have actually served an entire term, and go from there...

    Replies
    1. Arguably, Dexter and Aariak, both of whom are in the fifth year of their mandates, have completed a full term.

    2. Stephen Harper has completed more than a term as well.

  7. Someone made a comment on the NS projection page about the projection for Cumberland North. I didn't mean to allow comments on that page, so it wasn't posted.

    The commenter pointed out that Cumberland North was an anomaly in 2009 because it had an independent, former PC MLA on the ballot that split the PC vote. He is absolutely right, so the next projection will reflect a more accurate forecast for Cumberland North.

  8. With these numbers the campaign will be critical. Conceivably with a minor swing any party could win a plurality of seats although obviously the Liberals must be considered favourites at this point.

    Dexter earlier indicated he would call a vote in Spring of 2013. With that deadline passed and the polls not favourable I wonder if he will delay until Spring 2014 when his mandate will be up?

  9. To repeat my comment that (rightfully) died with the projection post's comment section: Cape Breton losing 5 incumbents would be an unprecedented event, no matter what the regional swing.

  10. I still don't understand what happened to the NDP government in NS that led to their defeat. It seemed they were doing well and had good policies, so what happened?
