Monday, December 3, 2012

BC New Democrats hold wide lead in first projection

The next election in British Columbia is less than six months away, and in ThreeHundredEight.com's first projection and forecast for the province the New Democrats hold a wide lead over the governing B.C. Liberals. They are also heavily favoured to win, even six months out.

An analysis of the projection results themselves can be read at The Globe and Mail here. Rather than go over the results a second time, I will explain some of the projection's new features.

A complete explanation of every aspect of how the projection and forecasting models work can be found here. For the most part, the basic methodology has not changed, so the bulk of that explanation covers what I have already described in detail before. In this post, I'll go over the features that have not been presented in past elections.

With this new model, I wanted to tackle a few problems but also take a somewhat different approach to how projections are presented. The most important problem is how to bridge the gap between what the polls show and what the election results end up being. The amount of data available in Canadian elections is relatively low, especially compared to the United States and particularly at the tail end of a campaign. It is common for the final polls of a campaign to be conducted days before the actual vote, leaving a big gap between the last information available and the time that people cast their ballot.

The projection now makes two calls. The first, which is what I call the projection itself, is the best estimate of a likely outcome if an election were held on the last day that the most recent poll was in the field. In the case of British Columbia, that is Angus-Reid's poll of November 21. The projection, then, is the best estimate of the outcome of an election held November 21.

In order to measure the degree of error that can be expected in the polling, high and low projection ranges are tallied. This is determined by applying the margin of error for the estimated "sample" of the projection itself. This sample is measured by comparing the weights and sample sizes of other polls to the most highly weighted poll in the projection. For example, let us assume that a projection has three polls in the database. All three have a sample of 1,000 people. One poll is rated at half the weight of the most highly rated poll, while the third poll is weighted at one-fourth of the most highly rated poll. The estimated "sample" of the projection would then be considered to be 1,750 people (1,000 people from the first poll, 500 from the half-weight poll, and 250 from the quarter-weight poll). That 1,750 sample is then used to determine the margin of error for each party, which changes according to the support a party holds. An easy demonstration of this is that a party with 2.5% support has a different margin of error than a party with 25%. If the margin of error in the poll is 3.1%, the range of results for the small party can't be as low as -0.6%.
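
The arithmetic of that example can be sketched in a few lines of Python. This is only an illustration, not the model's actual code: the function names are hypothetical, and the margin of error uses the standard normal approximation at a 95% confidence level, which is an assumption on my part.

```python
import math

def effective_sample(polls):
    """Sum each poll's sample size, scaled by its weight relative to the top-weighted poll."""
    top_weight = max(weight for _, weight in polls)
    return sum(size * weight / top_weight for size, weight in polls)

def margin_of_error(share, n, z=1.96):
    """Normal-approximation margin of error for a proportion (z=1.96 assumes 95% confidence)."""
    return z * math.sqrt(share * (1 - share) / n)

# (sample size, weight) for the three-poll example in the text
polls = [(1000, 1.0), (1000, 0.5), (1000, 0.25)]
n = effective_sample(polls)  # 1750.0

for share in (0.25, 0.025):
    moe = margin_of_error(share, n)
    # Support cannot fall below zero, so the low end is clamped
    low, high = max(0.0, share - moe), share + moe
    print(f"{share:.1%} support: +/- {moe:.1%} (range {low:.1%} to {high:.1%})")
```

Under these assumptions the party at 2.5% gets a much narrower margin than the party at 25%, which is why a single poll-wide margin cannot be applied to every party.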

The margin of error attributed to the projection's "sample" is then used to determine the likely ranges of the projection itself. This range rests on the assumption that the polls have accurately reflected the mood of the electorate, within their respective margins of error.

The current projection is heavily skewed towards Angus-Reid's last poll, since the amount of data is rather thin. This makes the "sample" of the projection relatively small, not much larger than the 800 people surveyed in that poll. For that reason, the high and low vote projections are rather wide. If more polls were available, that margin would be narrower. The high and low seat projection is based on these high and low vote projection ranges. Accordingly, that range narrows once more polls are available.

The second call made by the model is what I refer to as the forecast. Having a background in History, I am partial to the idea that the past can tell us a lot about the future. The forecasting model is based entirely on this premise.

In addition to the projection, the model also gives the plausible high and low results each party might be able to manage by election day, tentatively scheduled for May 14, 2013. Unless the polls begin fluctuating wildly, the margin will narrow as Election Day approaches.

The ranges are determined by measuring the degree of polling volatility in the past, with the period examined being equal to the amount of time before the next election. For example, the next election is scheduled for 174 days from the date of the projection, so the ranges are determined by the difference between the highest and lowest poll result for each party over the last 174 days. This is a measure of what kind of change in support is plausible based on how much that support has changed in the past. Of course, what is plausible does not equal what is possible - in theory, a party can get anywhere from 0% to 100% of the vote. That a party at 5% could win 75% of the vote six months from now is possible, and vice versa. It is not plausible, however, if that 5% has varied by only three points in the previous six months of polling.
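
A rough sketch of that look-back calculation follows. The poll figures are hypothetical, and the way the spread is applied to the current projection (added and subtracted symmetrically, clamped to 0-100%) is an assumption made for illustration; the text only specifies how the spread itself is measured.

```python
from datetime import date, timedelta

def volatility_range(polls, as_of, days_ahead):
    """Highest minus lowest poll result in the `days_ahead` days up to `as_of`."""
    start = as_of - timedelta(days=days_ahead)
    window = [support for day, support in polls if start <= day <= as_of]
    return max(window) - min(window)

def forecast_bounds(current, spread):
    """ASSUMPTION: apply the spread symmetrically around the current figure, clamped to 0-100%."""
    return max(0.0, current - spread), min(100.0, current + spread)

# Hypothetical poll history for one party: (last day in field, support in %)
polls = [
    (date(2012, 6, 15), 45.0),
    (date(2012, 8, 1), 49.0),
    (date(2012, 9, 12), 46.0),
    (date(2012, 11, 21), 47.0),
]
projection_date = date(2012, 11, 21)
election_date = date(2013, 5, 14)
days_ahead = (election_date - projection_date).days  # 174

spread = volatility_range(polls, projection_date, days_ahead)
low, high = forecast_bounds(47.0, spread)
print(f"Look-back window: {days_ahead} days, spread: {spread} points, forecast: {low}% to {high}%")
```

Because the window shrinks as the vote approaches, the spread can only stay the same or narrow unless new polls move sharply.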

This forecasting also cannot take into account completely exceptional events, but it is a best estimate of the plausibility of a party gaining or losing a certain amount of support.

The forecast will be continually updated as a measure of what should be expected, based on current information. Accordingly, it will change and the forecasts now may not overlap with the forecasts one month from the election. It is, instead, a best guess of what to expect based on what we know now.

In the same way that the seat projection model gives a range of likely outcomes based on the vote projection, the seat forecasting model gives a likely range of outcomes based on the vote forecast. These, of course, vary widely as the election date is far away. They give the range of plausible outcomes for the next election based on the information available to us right now.

Note that the seat forecast is not the same kind of assessment as the seat projection in terms of what seats are at play. Currently, the Conservatives are given a high forecast of 23 seats and the Greens of six seats. This does not mean that we should expect the Conservatives and Greens to win this many seats. In the case of the Conservatives, it means that if the party does end up at 25.8% support (their forecasted high, which would almost certainly mean the B.C. Liberals have dropped considerably), they could win as many as 23 seats. In the case of the Greens, it means that if the party ends up at 16% support (their forecasted high) they would be in play in as many as six seats. The one is dependent on the other. The Greens are not going to be at play in six seats at 7% support or even the high projected result of 8.7% support in the first projection. The model considers that they are at play in no seats at current polling levels.

One new feature added to the model is the probability that a call made by the seat projection model will be correct. This is based on an analysis of the seat projection model's performance in the eight elections in which it has made projections for individual ridings. This probability is determined by the margin by which the projection model estimates the winner will win. The following chart tracks how the projection has performed in the past, based on the projected winning margin in each riding.

Each red dot shows the percentage of calls that were correct, while the blue line shows the general trend. As the data is somewhat noisy (but the trend is still clearly visible), the trend line has been used to determine what probability of a correct call should be applied to every riding.

If the riding projection shows that a party leading in a riding by 12 points has a 73% chance of winning, that means that based on past performance the model will be right about 73% of the time when it chooses a winner by a margin of 12 points. It does not mean that there is a 73% chance that the projection for every party in the riding will be correct, or that the trailing party has a 27% chance of winning (a third place party could win as well). It is referring to the odds that the party projected to win will win.
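
To make the idea of reading a probability off a trend line concrete, here is a minimal sketch. The accuracy figures are hypothetical, and a straight least-squares line capped at 100% is an assumption chosen for simplicity; the shape of the trend actually used in the model is not specified here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def call_probability(margin, slope, intercept):
    """Probability of a correct call read off the trend line, kept between 0 and 1."""
    return min(1.0, max(0.0, intercept + slope * margin))

# Hypothetical data: projected winning margin (points) -> share of past calls that were correct
margins = [2, 6, 10, 14, 18, 22]
accuracy = [0.55, 0.62, 0.71, 0.74, 0.85, 0.88]

slope, intercept = fit_line(margins, accuracy)
for margin in (5, 12, 30):
    print(f"{margin}-point lead: {call_probability(margin, slope, intercept):.0%} chance the call is right")
```

Using the fitted line rather than the raw dots smooths out the noise in any single margin bucket.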

Another new feature of the model is the ability to calculate the probability of a party winning the next election as of the date of the projection.

This is also based on the performance of the projection model in the past. In short, it determines the probability that the amount of error in the seat projection will be less than the margin between the leading party and other parties. A margin as large as the 40 seats between the Liberals and the NDP in the current projection has been overcome in only 2.4% of cases. That means that if this were the final projection call before the election, the NDP would have a 97.6% chance of winning it. This takes into account the potential for an Alberta-sized error, but calling a winner by 40 seats would be right virtually all of the time.

However, this is not a forecast of the future. It is the confidence that can be placed on a call if the election were held the day of the projection. Forecasting the probability of a future event is an entirely different matter.

The greater challenge lies in being able to predict the probability that a party will win an election at a future date based on current information. The seat forecast shows what range of outcomes is plausible, but not which are most likely to occur.

After analyzing almost 6,000 pieces of data from polls conducted in over 20 federal and provincial elections since 2004, I have been able to determine the probability that the margin between any two parties can be overcome in a given period of time. As this analysis was based on the difference between polls and final outcomes, it takes into account both the past amount of error in the polls as well as the degree of real change that has occurred in voting intentions.

The model suggests how likely it is for the B.C. Liberals to overcome a 19-point deficit six months from the election, based on how often this sort of shift has taken place in the past. In this case, this sort of margin has been overcome in the six months prior to an election only 4.3% of the time, giving the NDP a 95.7% chance of winning the popular vote in May 2013 based on the polling of November 2012.
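
The counting behind a figure like that can be sketched as follows. The records below are hypothetical, and treating "comparable" cases as those with a lead at least as large within a window of days before the vote is an assumption for illustration, not a description of the actual analysis.

```python
def overturn_rate(records, min_lead, min_days, max_days):
    """Share of comparable past polling leads that were overturned by election day."""
    similar = [lost for days, lead, lost in records
               if lead >= min_lead and min_days <= days <= max_days]
    if not similar:
        return None  # no comparable cases on record
    return sum(similar) / len(similar)

# Hypothetical records: (days before the vote, leader's margin in points, did the leader lose?)
records = [
    (180, 20.0, False),
    (170, 22.0, False),
    (160, 19.5, True),
    (175, 25.0, False),
    (30, 21.0, False),  # excluded: too close to election day
    (178, 6.0, True),   # excluded: lead too small
]

rate = overturn_rate(records, min_lead=19.0, min_days=150, max_days=210)
leader_win_probability = 1 - rate
print(f"Overturned in {rate:.1%} of comparable cases; leader wins {leader_win_probability:.1%} of the time")
```

With the real figure quoted above, an overturn rate of 4.3% gives the leader a 1 - 0.043 = 95.7% chance.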

The calculations do take into account the role played by third parties and other parties, when they have a significant level of support. When the margin between the leading party and third or other parties is very large, the model assumes a (nearly) 0% chance that they could win. When three or more parties are a factor, the probability is calculated accordingly.

Calculating the probability of a party winning a future event should be very familiar to readers of Nate Silver's FiveThirtyEight blog. In his excellent book, The Signal and the Noise, Silver includes a chart showcasing the probability of a Senate candidate winning an election based on their polling on a given date. This gave me the opportunity to check my math, as it were. Here is a comparison of FiveThirtyEight's probability ratings to the ones that are employed by ThreeHundredEight:

As you can see, the probabilities are almost identical in most cases, and ThreeHundredEight's are often lower than FiveThirtyEight's. This may be a reflection of the complications caused by our three-or-more-party system. The numbers in the above chart assume a two-party race, which of course is not always the case in Canada. But it is also possible to use these numbers to calculate the odds of a party winning in a multi-party race.

This calculation is based solely on the probability of a party winning the popular vote. In U.S. Senate races, that is also what determines who wins the election. That is not the case in Canada, where a party can win the most seats with fewer votes. The number of variables at play to determine the winner of the most seats based on popular support in polls months before an election is, of course, enormous. But generally speaking, the party with the most votes will win the most seats, so while the probability of a party winning the popular vote may not necessarily determine their probability of winning the election, it is a close enough proxy. This is particularly the case in British Columbia, where neither the Liberals nor the NDP appear to have an intrinsic advantage in vote efficiency.

The projection will be updated continuously as the next vote approaches, with the forecast probabilities changing as new information becomes available. The B.C. model has been launched six months from the start date as it is the only upcoming election actually scheduled. Nova Scotia will be heading to the polls at some point soon, and that province's model will launch in the first half of next year if an election isn't already called. Ontario and Quebec could be heading back to the polls potentially even earlier than British Columbia, and this model will be tweaked accordingly and applied to those campaigns. But as B.C. is the only province we know is having an election in the next six months, the site will focus on that province's politics.

It should be an interesting campaign. The odds heavily favour the New Democrats, as a swing of 19 points even six months before an election is very rare. Much of the Liberals' hope lies with their ability to put the Conservatives back in their place. They aren't in much danger of being overcome by the Conservatives anymore, but at 12.8% the party makes it impossible for the Liberals to beat the New Democrats. Christy Clark will need to get those disaffected Liberals back into the fold if she is to have any hope of staying on as Premier.

30 comments:

  1. Even though 538 had a bad result when Nate tried to forecast the British election, I think that you would have had an extremely good and close result.

  2. As long as the Conservatives of any stripe don't win it works for me !

  3. Éric, if you don't mind my asking, where did you find the details for that Mustel poll conducted back in September (namely the polling dates)? I've been wanting to include it in the polling table of the 2013 BC Election Wiki page but all I've been able to uncover is the general tracking graph on Mustel's website, which doesn't indicate the exact field dates. Do you have a link to a page that provides the details by any chance?

    Dom

    Replies
    1. I got the details directly from Mustel via email.

    2. Ah, figures. Thanks anyway.

      Dom

  4. The BCLibs seemed to collapse after the HST fiasco but it is hard, although I favour the NDP, to believe that is/was enough to almost destroy the Liberals. What other factors created this mess for the Libs?

    Replies
    1. They had a brief resurgence after Clark was elected leader (were even briefly ahead of the NDP), suggesting that there was some willingness on the part of voters to forgive them for the HST, but then they plummeted again starting around the fall of 2011, I suppose following their embarrassing defeat in the HST referendum. That's also when Clark badly waffled on her original pledge to hold a premature general election to "legitimize" her Premiership, ultimately backtracking on it, and I think that's when voters began to turn on her. The BC Liberals have not managed to recover since.

      Dom

    2. I agree with Dom. Clark's reneging on her promise to call an early election signifies the beginning of her problems.

      Premiers are monarchs in all but name so to have one in office for 2 plus years without an electoral mandate borders on tyranny.

    3. To be more exact, it would seem the BC Libs initially got their "resurgence" bounce right after Campbell's resignation, not after Clark's ensuing election as leader. However, they did hit a high point (in all polling since the 2009 election) of 43% in an Angus Reid poll from 16-17 March 2011, the first one conducted after her leadership victory and swearing in as Premier:

      http://en.wikipedia.org/wiki/British_Columbia_general_election,_2013#Opinion_polls

      Dom

    4. I agree with Derek. A premier running the province for such a long period of time with the only say-so from the public being the Vancouver-Point Grey by-election is dodgy from a democratic stand-point. If the Liberals had won even one more seat in Ontario last time out, we'd be looking at a similar situation now.

    5. While I agree with all your points above, there's another reason the Libs have taken this nose dive. They've simply run their course. BC is a wonderful place. Jobs are up, the economy is up, corporate profits are up. But so are such things as cost of living, child poverty, the provincial debt. BCers have had enough and want to let the other guy give it a try -- for awhile.

  5. There have also been revelations of massive corruption, plus Christy Clark has turned out to be totally incompetent, plus just the accumulated grievances after 12 years in power.

    Replies
    1. What revelations of massive corruption? Nothing has been proven in Court. The Basi-Virk case was dismissed with costs going to the defendants!

      Frankly, if we are going to throw around corruption allegations willy-nilly everyone should be reminded of the corruption, fraud and obstruction of justice the NDP frequented while in office. The current NDP leader, falsified documents in order to instigate a cover-up and obstruct the course of justice.

      Whatever one's opinion of NDP policies the fact remains Dix had an ethical lapse some years ago. He committed ethically questionable if not immoral, corrupt and illegal acts to frustrate investigators, obstruct justice to protect his job and boss. He falsified Government documents, then lied about it. Are these the actions of a man who has the best interests of British Columbians at heart?

      Then we have the fudgeit-budgets, fast ferries, Nanaimo Commonwealth bingo scandal.

      Those who live in glass houses shouldn't throw stones!

  6. Oh Lordy !!

    The Duchess Of Cambridge is pregnant !!

    We can look forward to 9 months+ of baby babble !!

    Replies
    1. What's the matter? Not a fan of the data?

  7. I really like the new model, except one part sticks out as a concern to me:

    "The ranges are determined by measuring the degree of polling volatility in the past, with the period examined being equal to the amount of time before the next election."

    Is there not more volatility once the writ has been dropped? I feel like you need to add an "election" volatility bonus to account for this extra potential for movement once an election is called. I'm not quite sure the best way to calculate such a number though.

    Replies
    1. Once the election rolls around, it will take care of itself. And at this point, looking back so far you get a lot of variation in the numbers.

  8. Thanks for all the work you put into constantly improving the model Eric. I appreciate your dedication to ironing out kinks and trying to get to the best possible result. Hopefully, with the new model and the new way of presenting it, people will treat it less like an exact prognostication and will rather realize that it is a probabilistic statement. Of course Nate Silver got treated as producing an exact prognostication. I don't know why people are so bad at even understanding the basic concept of probability.

  9. Charles Harrison, 03 December, 2012 15:38

    Regional breakdown?

    Replies
    1. Click on the big chart at the top of the page.

  10. I love the changes you've made to the model here Eric.

    Replies
    1. Thanks! Me too. I've been pretty excited to launch the model, and it will be really fun if Ontario and Quebec go in the spring as well.

    2. Charles Harrison, 04 December, 2012 14:50

      It sure would!

    3. For certain values of "fun" as a political partisan living in Toronto, there have been so many elections that I am pretty much tapped out, donation- and volunteering-wise. We've had a municipal election, a provincial election, a federal general election and a federal by-election (even though I don't live in T-D). We are staring a mayoral by-election and a provincial election in the face. If both happen, I may have a mini breakdown.

  11. Today's new BC Ipsos-Reid poll:

    NDP - 48% (-1%)
    Liberal - 35% (+3%)
    BCCP - 9% (-3%)
    Green - 7% (+1%)

  12. Here's the Ipsos-Reid poll link:

    http://www.ipsos-na.com/news-polls/pressrelease.aspx?id=5915

  13. Thanks, got it. Update tomorrow.

  14. I wonder if pollsters look at 308 and check to see if their results are outliers or not.

  15. Wondering what weight, if any, you give to the recent federal by-election result on the Island?

    The federal by-election Green vote was significantly higher than what provincial polls are showing: Can anything be read into that? Is there anything to gain by comparing federal party polling (NDP & Green in particular) to provincial party polling results?

    Or is the only information worthy of consideration comparing the results to your model and learning how to refine your model?

    Replies
    1. I think it certainly bodes well for the Greens, but unless the provincial party starts putting up better numbers on Vancouver Island the model will not give them better numbers either.

