Thursday, March 3, 2011

Seat projection methodology

When individual polls are reported upon at ThreeHundredEight.com, seat projections are often reported as well. The following is a detailed description of that seat projection methodology. This is the same methodology used during and in the run-up to election campaigns, with some minor differences.

The results of each poll can be plugged into the seat projection model to give likely results, based on the premise that the results of an election would be exactly the same as the results of the poll. Assuming that to be the case, the seat projection model has a margin of error of only 3.2 seats per party and makes the right call in each riding 85% of the time.

At its core, the seat projection model uses a simple proportional swing method based on the difference between the results of the last election and current polls. Put simply, if a party managed 20% in a given region in the previous election and is now polling at 40% in that same region, their results in each individual riding would be doubled. The image below shows how this method would have estimated the NDP's support in the riding of Trinity-Spadina in the 2011 election.

This swing is applied to every party in each riding. As this will sometimes result in total support of more or less than 100%, the numbers are adjusted upwards or downwards proportionately to equal exactly 100%.
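
The two steps described above, the proportional swing and the adjustment back to 100%, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the riding and regional numbers are hypothetical, and the actual model is kept in a spreadsheet.

```python
def proportional_swing(prev_riding, prev_region, poll_region):
    """Scale each party's previous riding share by the ratio of its current
    regional poll number to its previous regional result."""
    return {
        party: share * poll_region[party] / prev_region[party]
        for party, share in prev_riding.items()
    }


def normalize(shares):
    """Rescale all shares proportionately so they sum to exactly 100."""
    total = sum(shares.values())
    return {party: 100.0 * share / total for party, share in shares.items()}


# Hypothetical numbers, in percent (not taken from any real riding).
prev_riding = {"A": 30.0, "B": 50.0, "C": 20.0}
prev_region = {"A": 20.0, "B": 50.0, "C": 30.0}
poll_region = {"A": 40.0, "B": 35.0, "C": 25.0}

# Party A went from 20% to 40% in the region, so its riding share doubles.
raw = proportional_swing(prev_riding, prev_region, poll_region)

# The raw shares no longer sum to 100, so they are rescaled.
projection = normalize(raw)
```

Here Party A's raw share doubles from 30 to 60, the raw shares sum to more than 100, and `normalize` brings every party down by the same proportion.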

This model is in contrast to the uniform swing method popular in the United Kingdom. With that method, in the example of Trinity-Spadina the NDP's increase of 7.4 percentage points in Ontario would simply have been added to the NDP's 2008 result in Trinity-Spadina, estimating that the party would have captured 48.3% of the vote instead of the 57.7% that proportional swing would suggest. In this one case, that would put the error of uniform swing at about double the error of the proportional swing method.
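
For comparison, uniform swing is a single addition. The sketch below uses the Trinity-Spadina figures quoted above; the 2008 starting point of 40.9% is not stated directly but is implied by 48.3 minus 7.4. The function name and the second, weaker riding are hypothetical.

```python
def uniform_swing(prev_riding_share, regional_change_points):
    """Add the regional change, in percentage points, to the riding result."""
    return prev_riding_share + regional_change_points


# NDP in Trinity-Spadina, 2008: 40.9% (implied by 48.3 - 7.4 in the text).
trinity_spadina_estimate = uniform_swing(40.9, 7.4)

# A hypothetical riding where the party had only 5% gets the same 7.4 points.
weak_riding_estimate = uniform_swing(5.0, 7.4)
```

Under uniform swing the strong riding and the weak riding both gain the same 7.4 points, which is the behaviour the proportional method is meant to avoid.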

The proportional swing method is a better estimation of how results change between elections, reflecting that a party with a large base of support in a riding is more likely to grow by larger proportions than a party with no real support. It can also perform well when parties make very large gains - with the actual province-wide results plugged into the model, it would have projected 60 seats for the NDP in Quebec to four for the Bloc Québécois, instead of the actual result of 59 to 4.

Taking other factors into account

The swing model, however, cannot take into account the individual characteristics of each riding. Other factors need to be taken into account.

Incumbency is the most important factor, as it applies to every riding and can have a significant effect. My own research (originally conducted in February 2011) shows that support for incumbents is far more resilient than for other candidates, and that when parties do not have incumbents on the ballot they suffer a serious loss in support. That drop equals about 10% of what the party managed in the previous election, resulting in a slip of anywhere from four to six points (all else being equal). But the incumbency effect is also determined by how a party is doing overall. My research shows that an incumbent retains more of their vote when their party's support is dropping in the region. It also shows that incumbents who have been re-elected at least once make smaller gains when a party's support is increasing in a region, while incumbents running for re-election for the first time tend to out-perform their own party's gains.

This would seem to be a reflection of the difference between a first-time incumbent and a veteran incumbent. A veteran has a more solid base of support that is harder to move in either direction, whereas a sophomore is now a much safer bet compared to when they first ran for election. They have a record of winning, whereas in the previous election they had none.

Accordingly, when a party is down in support in a poll, incumbents are given a "bonus" usually worth three to five points. When a party is up, sophomores are given a bonus worth about one to two points while veterans are penalized by about that much. When the incumbent is not running for re-election, the party is penalized accordingly.
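
The incumbency rules above can be written as a small lookup. This is a rough sketch, not the model's actual formula: the constants are midpoints of the ranges quoted in the text, and the function name, the `trend` and `status` labels, and the choice to apply no adjustment when a party's support is flat are all assumptions.

```python
# Midpoints of the ranges quoted in the text (assumed, not exact model values).
DOWN_BONUS = 4.0           # "three to five points" when the party is down
UP_SOPHOMORE_BONUS = 1.5   # "about one to two points" when the party is up
UP_VETERAN_PENALTY = 1.5   # veterans are "penalized by about that much"
OPEN_SEAT_LOSS = 0.10      # about 10% of the party's previous result


def incumbency_adjustment(trend, status, prev_share):
    """Return the adjustment in percentage points for one party in one riding.

    trend:  "up", "down" or "flat" (the party's regional polling direction)
    status: "sophomore", "veteran", "open" (incumbent not running) or "none"
    """
    if status == "open":
        return -OPEN_SEAT_LOSS * prev_share
    if status not in ("sophomore", "veteran"):
        return 0.0
    if trend == "down":
        return DOWN_BONUS
    if trend == "up":
        return UP_SOPHOMORE_BONUS if status == "sophomore" else -UP_VETERAN_PENALTY
    return 0.0  # assumption: no adjustment when support is unchanged
```

For a party that took 50% in a riding last time and no longer has its incumbent on the ballot, `incumbency_adjustment("up", "open", 50.0)` gives a five-point penalty, in line with the four-to-six-point slip described above.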

The effect of having star candidates on the ballot is the largest bonus of the projection model, usually worth about five points. Based on the results of an analysis conducted in November 2010, star candidates improve their party's performance in the vast majority of cases, though the classification of star candidates is difficult to determine. This is one of the purely subjective aspects of the model, as I have to determine whether a candidate should be considered a "star" or not. This is usually quite obvious, and one of the biggest determinant factors is whether a candidate is widely considered as a star in the media (which in and of itself plays a role in how a candidate is seen by voters). Star candidates are usually former MPs or cabinet ministers, party leaders, or well-known figures from the private sector.

Floor crossing is a difficult factor to take into account, as the amount of support an MP or provincial representative brings with them can vary dramatically. But an analysis of past cases shows that the effect can be very large, with the floor crossing candidate able to increase their new party's support in a riding by about half, while the other party's support drops by about a quarter.
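
As a sketch of the floor-crossing adjustment, the snippet below raises the new party's share by about half and cuts the former party's share by about a quarter, as described above. The function name and the example shares are hypothetical, and rescaling to 100% afterwards is an assumption carried over from the swing step.

```python
def apply_floor_crossing(shares, new_party, old_party, gain=0.5, loss=0.25):
    """Boost the floor crosser's new party, reduce the party they left,
    then rescale so the riding sums to 100."""
    adjusted = dict(shares)
    adjusted[new_party] *= 1 + gain
    adjusted[old_party] *= 1 - loss
    total = sum(adjusted.values())
    return {party: 100.0 * share / total for party, share in adjusted.items()}


# Hypothetical riding: the MP was elected for "Old" and now runs for "New".
before = {"Old": 40.0, "New": 30.0, "Other": 30.0}
after = apply_floor_crossing(before, new_party="New", old_party="Old")
```

In this example the new party moves from second place into the lead, which shows how large the effect can be in a competitive riding.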

The presence of independents can also be difficult to model. If an independent politician is running for re-election as an independent, their vote is dropped by about one-eighth from the previous election, as has occurred in other cases. The same penalty is applied to popular independent candidates who were never elected. Independents who left or were forced out of their party caucuses are treated differently. Based on an analysis of previous cases, these candidates take a proportion of their vote share from the previous election based on the circumstances of their departure from caucus. Those who depart for positive reasons retain much more of their support than those who leave in disgrace. When the circumstances are hard to define, an average proportion is used. Those votes come directly from the party the candidate left.
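
The independent rules can be sketched as follows. The one-eighth drop comes from the text; the retention rates for candidates who left their caucus are not given in this post, so the values below are placeholders, and splitting the candidate's previous share between the candidate and the former party is a simplification. All names are hypothetical.

```python
# "About one-eighth" drop for independents running again (from the text).
RETURNING_INDEPENDENT_RETENTION = 1 - 1 / 8

# Placeholder retention rates: the real proportions are not stated in the post.
CAUCUS_DEPARTURE_RETENTION = {"positive": 0.6, "disgrace": 0.2}


def returning_independent_share(prev_share):
    """Share for an independent who also ran as an independent last time."""
    return prev_share * RETURNING_INDEPENDENT_RETENTION


def departed_caucus_split(prev_share, circumstances=None):
    """Split a former party candidate's previous share into
    (independent_share, share_left_with_former_party)."""
    rates = CAUCUS_DEPARTURE_RETENTION
    # When the circumstances are hard to define, use the average proportion.
    average = sum(rates.values()) / len(rates)
    rate = rates.get(circumstances, average)
    independent = prev_share * rate
    return independent, prev_share - independent
```

For example, `returning_independent_share(40.0)` gives 35.0, and whatever the departed candidate keeps in `departed_caucus_split` is removed from the party they left rather than from the other parties.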

By-elections are also taken into account. When the result of a by-election was significantly different from the results of the previous general election, the proportional swing is applied to the by-elections results based on how current polling levels differ from where the parties stood in the polls at the time of the by-election.
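
A by-election adjustment of this kind could look like the sketch below, which swings the by-election result by the ratio of current polling to polling at the time of the by-election. The function name and the numbers are hypothetical, and the post does not define how different a by-election result must be to count as "significantly different".

```python
def project_from_byelection(byelection_result, polls_at_byelection, current_polls):
    """Apply proportional swing to a by-election result, using the regional
    polls at the time of the by-election as the baseline, then rescale to 100."""
    swung = {
        party: share * current_polls[party] / polls_at_byelection[party]
        for party, share in byelection_result.items()
    }
    total = sum(swung.values())
    return {party: 100.0 * share / total for party, share in swung.items()}


# Hypothetical numbers, in percent.
byelection_result = {"A": 45.0, "B": 35.0, "C": 20.0}
polls_at_byelection = {"A": 30.0, "B": 40.0, "C": 30.0}
current_polls = {"A": 33.0, "B": 36.0, "C": 31.0}

projection = project_from_byelection(
    byelection_result, polls_at_byelection, current_polls
)
```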

A post-mortem on the factors described here for the 2011 federal election campaign can be found here.

The particularities of an election

When necessary, the projection model takes into account any individual particularities of a jurisdiction. One common particularity is the presence of a new party, or a formerly fringe party running high in the polls.

When a party is running candidates where they did not have a name on the ballot in the previous election (whether that be limited to a handful of ridings, as often occurs with smaller parties, or in the bulk of ridings, as occurred in the 2012 election in Alberta for Wildrose), the regional poll result for the party is applied directly to the riding. For example, if a party is polling at 20% in a region it will be projected to have 20% in each riding in that region. However, that number can be adjusted by any of the factors listed above and is always adjusted when the model makes all of the projections add up to 100%. In this example, in ridings where there is little room for the party to have 20% their vote will be adjusted downwards. When there is a lot more room, the vote will be adjusted upwards. This system performed well when the real results of the 2012 Alberta election were applied: Wildrose would have been projected to win 18 seats (instead of the actual result of 17).
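
The effect of "room" in a riding falls out of the rescaling step, as this sketch shows. The function name and riding numbers are hypothetical; the inputs for the existing parties are assumed to be their shares after the swing has already been applied.

```python
def project_with_new_party(swung_existing, new_party, regional_poll):
    """Give the new party its regional poll number in the riding, then
    rescale every party proportionately so the riding sums to 100."""
    shares = dict(swung_existing)
    shares[new_party] = regional_poll
    total = sum(shares.values())
    return {party: 100.0 * share / total for party, share in shares.items()}


# Little room: the existing parties already add up to 95 after the swing.
crowded = project_with_new_party({"A": 55.0, "B": 40.0}, "New", 20.0)

# More room: the existing parties add up to only 70 after the swing.
open_field = project_with_new_party({"A": 40.0, "B": 30.0}, "New", 20.0)
```

In the crowded riding the new party ends up below its regional 20%, while in the riding with more room it ends up above 20%.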

Precision when there needs to be uncertainty

A seat projection for an individual poll will appear precise: a party is projected to win X seats, no more and no less, with what the poll reports. But just as the poll itself is susceptible to any number of sampling errors, the projection is susceptible to the model's own errors. These are further amplified by the margin of error in the poll.

Seat projections need to be read with a good understanding of what we don't know. An error in the poll by one or two points could change the results in dozens of seats if a race is close enough, and that would still be within the margin of error of the poll. And an error in the modelling of a half-dozen seats could mean the difference between winning or losing an election. The seat projection for individual polls is a best guess and the most likely outcome - but is no more foolproof than the poll it is based upon. A grain of salt, which needs to be taken with every poll, also needs to be taken with the seat projections.
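
To illustrate how sensitive seat counts can be, the sketch below runs a toy region of five close ridings through proportional swing three times: once with the poll as reported, and once each with a two-point error in either direction. The ridings, the numbers and the function name are all hypothetical.

```python
from collections import Counter


def seat_count(ridings, prev_region, poll_region):
    """Apply proportional swing to every riding and count the winners.
    Rescaling to 100 is skipped because it does not change who leads."""
    seats = Counter()
    for prev_riding in ridings:
        swung = {
            party: share * poll_region[party] / prev_region[party]
            for party, share in prev_riding.items()
        }
        seats[max(swung, key=swung.get)] += 1
    return seats


# Five hypothetical ridings, most of them close two-way races.
ridings = [
    {"A": 41.0, "B": 39.0, "C": 20.0},
    {"A": 39.5, "B": 40.5, "C": 20.0},
    {"A": 39.0, "B": 41.0, "C": 20.0},
    {"A": 45.0, "B": 35.0, "C": 20.0},
    {"A": 35.0, "B": 45.0, "C": 20.0},
]
prev_region = {"A": 40.0, "B": 40.0, "C": 20.0}

as_reported = seat_count(ridings, prev_region, {"A": 40.0, "B": 40.0, "C": 20.0})
a_plus_two = seat_count(ridings, prev_region, {"A": 42.0, "B": 38.0, "C": 20.0})
a_minus_two = seat_count(ridings, prev_region, {"A": 38.0, "B": 42.0, "C": 20.0})
```

With the poll as reported, Party A takes two of the five seats. Shifting the poll by two points in either direction moves the count to four seats or to one, even though both shifts would sit inside a typical margin of error.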

35 comments:

  1. This sort of deep projection is always what I hoped threehundredeight would eventually offer. I first found 538 because I had read Nate Silver's work on baseball projection for some years prior to 538's founding, and I thought it was a terrific field to which to apply the same sort of statistical approach.

    And from 538, I heard about you. I knew that the level of data available to you wasn't anything like what Nate got to use at 538, but I hoped that you would ultimately create a projection system that would call races riding by riding.

    Bravo, sir. You have done tremendous work here.

  2. Fantastic stuff - great to see the details!

    Just curious - although your previous model couldn't call things on a seat-by-seat level, how did it do at the total numbers for each party? I.e. how much change did it make to the bottom line?

  3. Looking at your current projections, other than a couple of decisions you make that seem odd, I would say you have 3 that seem wrong to me and then there are several that really will be toss ups and can not be accurately projected before the election.

    In general I think your model is getting there but with 308 ridings, there is a lot of local tweaking that needs to go on.

    Esquimalt Juan de Fuca was held by the MP and not the party - people here forgave him his Liberal connections and voted for him narrowly. With Keith Martin retiring it kills the Liberals. Both the Conservatives and NDP are already in full campaign mode.

    North Vancouver, projecting a Liberal win is odd. The 2006 result was low for the Conservatives because the candidate was not popular. How much incumbency bonus do you give Andrew Saxton?

    In Kingsway I have real trouble seeing Don Davies of the NDP losing. I assume you assigned him an incumbency bonus. Don Davies is significantly more popular than past MP Ian Waddell.

  4. Ira,

    Thanks, that is appreciated.

    jbailin,

    It wouldn't be quite fair to use the old model to project for 2008, as the model was based on past elections. So it will always be pretty close to past election results, since those results were what determined the seat distribution in the model.

    But it would have projected 23 Conservatives, 8 NDP, and 5 Liberal seats in BC.

    The new model would have projected 24 Conservatives, 9 NDP, and 3 Liberals.

    The actual result was 22 Conservatives, 9 NDP, and 5 Liberals.

  5. Bernard,

    Esquimalt - Juan de Fuca is a tricky one. Everyone says that it is a Martin riding and not a Liberal one. But when Martin wasn't a Liberal, the Liberals didn't do horribly (22% to 26%).

    So, there is a bit of a base there. And after three elections, old habits die hard.

    I'm afraid I can't just go on a gut feeling or local perception on this, as there aren't any numbers to back it up. So I have to treat it as I do every other Liberal riding without an incumbent.

    As for North Vancouver, it went Liberal in 2004 and 2006 and was a close race in 2008. The Liberals were also at 30%+ in the 90s and in the 2000 election. Saxton, as a sophomore MP, gets a small incumbent bonus.

    For Vancouver Kingsway, the Liberals used to hold the seat and it is difficult to know how much of their lost vote in 2008 was due to Emerson jumping ship. Davies gets the bonus as an incumbent, of course, but the NDP is not doing very well in BC. In any case, as you can see, it is a close race. A couple NDP ticks up and Liberal ticks down will put it back in the orange.

  6. London West will almost certainly go Liberal in the next election. Doug Ferguson is an exceptionally strong candidate and very popular locally. He also has an excellent organization online and on the ground. Conservative Ed Holder has been caught misleading constituents with his regular flyers and for the most part, had very little success securing Harper favours for the riding during the stimulus spending: http://www.lfpress.com/news/london/2011/02/09/17216431.html

  7. Bravo! can't wait to see the seat-call accuracy nationwide.

    Can you comment on your plans if an election comes around RE: projection updates? since polls will be coming out every few days if not daily, will you be doing a full update very frequently to get the most recent data into your projection?

  8. Anonymous,

    Another close race to watch.

    Bryan,

    Just as I was during the New Brunswick election, I expect to be very busy if a federal election is called. Ditto if there will be a BC election in the spring/summer, and of course for the myriad of provincial ones this fall.

    My current plan is to do a resume of all the polls released in the preceding 24 hours and a projection update every morning, with the afternoon or evening open to posts on other topics.

  9. Éric - I think you've made the right decision on Esquimalt-Juan de Fuca. Yes, there is reason to believe that the projection's result isn't necessarily accurate, but the point of building a projection model like you have is to stick with it and see how it does.

    Based on the next election result, maybe you'll learn something that will allow for a tweak (maybe some adjustment that you make only for retiring MPs who previously crossed the floor - who knows?), but for now I think it's important that you leave the projection as it is until you have more data.

  10. As the big driving force for proportional projections it does feel a bit good to see that I knew what I was doing :P

  11. Bernard -

    In 2008, the Liberals in BC hit a 26-year low at 19% (Dion + the Green Shift were the main culprits).

    Over the previous 5 elections, the Liberals garnered 28% of the vote in BC. I'll wager that the Liberals will rebound in BC compared to 2008.

    The Liberals' core support in BC is within the City of Vancouver and neighbouring environs as well as Greater Victoria, to a lesser extent.

    The Libs in EJDF have selected a poll-topping councillor from Langford and I can see the electoral dynamic in that race to be "Vote Liberal to keep out the Con", since the NDP was so far behind in 2008.

    In V-K, the NDP seems to have hit a glass ceiling and the demographics are more Liberal friendly. I can also see Yuen taking that riding back for the Libs.

    Same for N-V, where the incumbent MP is a social conservative and the demographics are more Liberal friendly.

    IOW, I wouldn't count the Liberals out winning 1 - 3 of the aforementioned seats.

  12. Eric is there any bonus for parliamentary secretaries ?

    When you look at somebody like Shelly Glover, she does more announcements and is more visible than some ministers.

    It's a signal that these people are rising stars and that the party is going to fight to protect them.


    I wouldn't be surprised if there was some modifier you could determine and apply.

  13. No. It might be something to look into because of the extra visibility, but I'm not sure if it is worth all that much.

    The average voter probably knows if their MP is a cabinet minister, but parliamentary secretary?

  14. Incidentally, my argument for why you shouldn't make special adjustments for places like Esquimalt is the same argument Nate Silver used for why he shouldn't make special adjustments to his baseball projections for Ichiro Suzuki. Nate knew his projection system (PECOTA) was going to get Ichiro wrong. He knew it would be wrong every year, and he even knew why and how it would be wrong. And while he adjusted the system overall from time to time, he never made specific adjustments for atypical players, because that wouldn't allow for a fair assessment of his projections overall.

    Here's an article Nate wrote about PECOTA's specific failures back in 2004:

    http://www.baseballprospectus.com/article.php?articleid=3497

  15. Congratulations - extremely impressive model, and fascinating to read the gloss.

  16. "The average voter probably knows if their MP is a cabinet minister, but parliamentary secretary?"

    My MP got a front page story in both local newspapers. It was kind of a big deal.

  17. Shadow - I can't recall ever having seen my MP's actions reported in the paper (granted, she's in opposition).

  18. I tend to agree with Ira on the dangers of tweaking ridings.

    Once you start modifying individual ridings or subjectively judging who is and isn't a star candidate, it becomes more of an exercise like the Election Prediction Project, than a statistical projection system.

    Still, most of these additions, especially incumbency, are welcome. The only danger I'd caution is in over-fitting the model. You're validating the model using the same data it was constructed on, so obviously it will be effective. The real test would be to try and validate it using a different data set (i.e. 2006 election).

  19. CalgaryGrit,

    I think including a star candidate factor, even if arbitrarily applied, is important. There is no other way to represent the big shifts in support that star candidates bring to the table. My primary concern is with the model being accurate. Being uniform is secondary.

    And taking into account individual factors (Elizabeth May, candidates dropping out, prominent independents) is very important if the model is to be taken seriously. It would be unbelievable to have Ms. May at 10% in her riding, or Helena Guergis at 1%.

    As to your second point, I've done my best to avoid the issue. All of the factors are based on what happened in both the 2006 and 2008 elections, with a few of them using even older examples as well.

    The tests were done only in British Columbia to, in part, mitigate the concern you bring up. The factors applied to each riding were drawn from all parts of the country, so what happened in BC is diluted to a great deal by what happened in the other provinces. That the results are still accurate for British Columbia after applying cabinet minister and incumbency increases, for example, drawn primarily from Ontario and Quebec prove to some degree that the model is sound.

    And, as pointed out in the post, I have tested for the 2006 election in terms of seat calls, and the model had the same 34 for 36 result in BC. If the next election turns out to be in October 2012, I'll have time to perhaps go even further back.

  20. Like Bernard, I question the EJF and North Van Liberal picks. With the retiring of Keith Martin, there will be many Conservatives who come home. It will be a ND/Conservative race, with the Libs coming in a very weak third. Langford is a small part of the riding and winning a few polls with the municipal level turnout doesn't mean anything.
    In North Van, where I live, the Libs don't even put a candidate in place until tomorrow and none of the potential candidates are close to Don Bell - the only Liberal to ever win this riding with a 30 year history of election wins on School Board, Council, Mayor and twice as MP until Andrew Saxton beat him.

    And Johnny Quest, Andrew Saxton the current Conservative MP is not a social conservative.

  21. Hi Éric, thanks for your great website, I check it every day now!

    You said somewhere that your model doesn't have a MOE and that all you can do is compare how the model fared in previous elections. But, actually, I think you could produce an MOE for your model using Monte-Carlo simulations, and that would also be very instructive. Here are some thoughts on how:

    You could add noise to each poll according to that poll's margin of error. Even if you do not have the poll's precise error distribution, you could sample from a Gaussian distribution of the same spread. If not a safe bet, at least it's a reasonable assumption. It's called the Bootstrap technique and I've used it in my work before.

    Then run your model thousands of times, each time with new noise samples drawn from each poll's error distribution (or Gaussian proxy). What you get ultimately is a *distribution* of outcomes, and from that you can calculate the odds of, say, the Conservatives getting 100 seats, 110 seats, 120, etc. etc. It sounds complicated, but it's really not (get back to me if you would need some programming help!)

    I think this is more or less what Nate Silver used in his 2008 model of the American presidential elections. It'd be nice to see on here, but I realize you only have 24 hours in a day, some of which should be spent sleeping!

    Cheers,

    Stéphane

    ---------------------------------------------------
    Stephane Rainville, PhD

    Founder and Senior Consultant
    VizirLabs - Helping scientists invest in their own productivity

    www.vizirlabs.com
    ---------------------------------------------------

  22. It would be neat to have something like that, I agree. It's hard to change things mid-stride, though.

  23. How much weighting does a "Star Candidate" get for each riding?
    My riding, Sydney-Victoria, has a new Federal candidate (Cecil Clarke) who was previously in the Provincial gov't as house speaker, minister of transport, and Attorney-General.

  24. Roughly an increase of 14% (not 14 points). I will look into Cecil Clarke. Thanks for the local knowledge!

  25. I agree with Bernard and George on thinking you should take another look at Esquimalt-Juan de Fuca using some of your own methodologies.

    I think you could apply New Star Candidate and Individual Oddities to this riding.
    1. New Star Candidate: Garrison's name is widely known in the community. As you may know he has run before and beat Conservative candidate Troy DeSouza, and even came very close to defeating incumbent Keith Martin. Lillian Szpak is relatively unknown to those outside of Langford. Added to this is the fact that Jack Layton (another star factor) has come to the riding twice now to promote the NDP, and on one of those visits he held a very large rally.
    2. Individual Oddities: As previously mentioned, the history of the riding is very NDP/Conservative split. University of Victoria professor and Victoria's local expert on elections, Denis Pilon, believes the voters in EJDF will revert back to their traditional voting patterns: http://www.youtube.com/watch?v=zjVfOMmLE1A#t=1m07s. Other local experts have also weighed in that this is a contest between Randall Garrison and Troy DeSouza (see: http://tinyurl.com/6bgd8k8 & http://tinyurl.com/3g79lll & http://tinyurl.com/3zp3mya & http://tinyurl.com/3bovwhc )

    You've applied similar changes to other ridings, so I would ask you use local knowledge to do the same in this case. Thank you.

    Mark

  26. Are there any examples of ridings with a similar history to Esquimalt - Juan de Fuca?

    It would need to be a riding with a floor crosser who was re-elected with his/her new party, but then retired and did not stand for re-election.

    If there are a few examples of this, and if I can see some consistent behaviour, I will consider tweaking EJdF.

  27. Yes there are a few floor-crossers that resulted in similar situations:

    -David Emerson, elected to Vancouver Kingsway, crossed the floor from Liberal to Conservative in 2006. This is perhaps the most similar to Esquimalt Juan de Fuca as the riding is considered to be historically very left-leaning. In 2008 he did not seek re-election and the riding reverted back to its historical patterns and elected an NDP candidate.

    -Belinda Stronach, of Newmarket-Aurora, crossed the floor from Conservative to Liberal and won re-election after crossing. When she left politics in 2008, the riding reverted back to Conservative.

  28. It appears tooclosetocall.ca, which uses a similar model to yours (i.e. star factor and individual oddity) has manually edited their projection for EJdF. He has a specific blog post outlining how it was done. http://www.tooclosetocall.ca/2011/04/modifications-to-esquimalt-huan-de-fuca.html#idc-container

  29. Mark,

    David Emerson is not a similar situation as he never stood for re-election as a Conservative.

    Belinda Stronach's experience is a good one, but I would need more than one such example.

  30. Éric,

    A message to say bravo for having built an excellent site. Your daily analyses are excellent. Keep up the good work!

  31. Hi,

    Curious as to your background in stats. Not looking to critique, but rather to know what sort of education formal or self-taught is required for this kind of modelling.

    Any reference appreciated.

    Thanks for the site, it's a great resource.

  32. Mike,

    This post (linked to under the LE DEVOIR image) provides you with some background information about myself:

    http://threehundredeight.blogspot.com/2008/10/introduction-to-threehundredeightcom.html

  33. So how exactly do you calculate it?
    Is it an Excel file?

  34. Yes, I use a spreadsheet program.

  35. Have you ever thought of making a simplistic version so people could "Make their own predictions"?

