Thursday, May 5, 2011

Weighing the polls

On Monday night, ThreeHundredEight's seat projection model failed for two reasons. First, the 7% reduction in weight given to polls for each day that passed was not nimble enough to capture the huge shift in voting intentions in the last two weeks. Second, the polls underestimated Conservative support by over two points nationally, three points in Ontario, and four points in British Columbia. As the seat projection is only as good as the polls fed into it, this second factor compounded the first.

So what steps can be taken to fix the problem? This fall, a flurry of provincial elections will occur and ThreeHundredEight will be endeavouring to do a much better job of accurately projecting their outcomes. Tuesday's post demonstrates that the projection model can produce good results, and tweaks will be made to make it even better. But clearly the vote projection model has to be overhauled.

The first step was to throw out the weightings given to the last three elections. The idea was to use these elections as an "anchor", ensuring that the projection would not be thrown off too much by the fluctuations in the polls. For example, using this anchor was an effective way to keep the Green vote below the unrealistic support levels at which various polls put them.

But in a wild campaign like that of 2011, the anchor was unnecessarily weighing down the projection. Indeed, instead of being off by a cumulative 12.8 points, the projection would have been off by 12.2 points without the anchor. That means it's gone.

Next, I played around with the vote projection model and changed only the rate of depreciation of each poll's weight by age. The following chart shows the cumulative error between the vote projection and the actual result for the five main parties at various reduction rates (i.e., 0.75 means a daily reduction of 25%).
As you can see, the best weighting was between a daily reduction of 20% and 30% (0.80 to 0.70). Delving more deeply, I found that a reduction of 23% per day had the best result, bringing the cumulative error down to 6.8 points for the five parties, an average of 1.4% per party.

Rates of 21% and 22% also resulted in a cumulative error of 6.8 points, but after looking at how it would have done at the regional level for Ontario, Quebec, and British Columbia, I settled on 23% being the closest.
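To illustrate how this kind of age-decayed averaging works, here is a minimal Python sketch of the daily-reduction component only. The poll dates and shares are hypothetical, and the real model also weights polls by other factors (such as sample size and pollster record), so this is not the full method:

```python
from datetime import date

def weighted_average(polls, rate, as_of):
    """Average poll results, cutting each poll's weight by `rate`
    (e.g. 0.23 = 23%) for every day between its mid-point date
    and `as_of`."""
    weighted_sum = 0.0
    total_weight = 0.0
    for midpoint, share in polls:
        weight = (1 - rate) ** (as_of - midpoint).days
        weighted_sum += weight * share
        total_weight += weight
    return weighted_sum / total_weight

# Hypothetical Conservative shares from three polls of different ages
polls = [
    (date(2011, 4, 28), 37.0),
    (date(2011, 4, 30), 36.0),
    (date(2011, 5, 1), 38.0),
]
result = weighted_average(polls, 0.23, date(2011, 5, 1))
```

With a 23% daily reduction, the three-day-old poll counts for less than half as much as the newest one, which is the "nimbleness" being tuned here.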

Nevertheless, the projection would have still been off. Though it would have been very close for the New Democrats and Bloc (on the money for the Bloc and off by 0.2 points for the NDP), it would have been further off for the Conservatives (36% rather than the actual 39.6%) and the Liberals (20.5% instead of 18.9%). In fact, for the Tories it was impossible to get closer than 36.3% using a weighted average of all polls.

The pollsters are partly to blame for that. If I dropped the weighted projection completely and simply used the average of the final poll from each polling firm released within the last five days of the campaign, I would still have only gotten 37.4% for the Conservatives, off by 2.2 points. Most significantly, the Conservative average would have been low by 4.6 points in British Columbia and 3.5 points in Ontario - that is where the Tory majority was won.

Nevertheless, abandoning the weighted vote projection for a simple average would have given a better result, with a cumulative error of 5.8 points. The average cumulative error in British Columbia, Ontario, and Quebec would have been 8.5 points, rather than the 9.6 points when using a reduction rate of 23% in the weighted average.

At this point, one might suggest that I move to a polling average of only the most recent polls and be done with it.

However, the 2006 election tells a different story. There are some parallels between the 2006 and 2011 elections, as they both featured a shift in public opinion during the campaign. Using a weighted average of all polls with a reduction rate of 10% per day would bring about a very good result, with a cumulative error of 3.3 points at the national level. Any other reduction rate pushes the error to 3.9 or 4.0 points, which is still very acceptable over five parties.

But using only the final polls of the campaign would result in a cumulative error of 5.8 points. In other words, while a simple poll average would have been more accurate in 2011, the weighted poll averaging system that I use in my popular vote projection would have been more accurate in 2006. And of all these scenarios, using the average of the last five days of polling in the 2008 election is the worst of all, with a total error of 6.9 points.

If ThreeHundredEight is still around for the 2015 election, some sort of ballot box effect might have to be taken into account. Obviously, a ballot box effect at the federal level cannot be translated to the provincial level - the parties just aren't the same. But this daily weighting of 23% appears to give good results, and so will be used for the provincial projections this fall. Adjustments for polling error may also be investigated for each of the provincial elections and included in the model, while the track records of each of the pollsters will also be adjusted in the wake of this election.

On that note, I will be taking a detailed look at how the pollsters did (briefly hinted at here) next week.

UPDATE 06/05/11: Running the projection at 23% daily reduction, I would have projected 143 Conservatives, 102 New Democrats, 49 Liberals, and 14 Bloc seats. While it would have been a much better projection, it still would not have called the Conservative majority - just like the polls did not.

1. I don't think that, given the volatility of the period, ANY polling system or other public opinion sampling will be in the least accurate.

Eric you were essentially aligned with the majority of the for-profit pollsters!!

Better than that I don't think we can expect !!

2. You've received quite a bit of criticism for various details of your methodology (the old provincial seat caps, for example, or your use of the old election data), but I need to point out that those criticisms didn't have any data to support them.

Regardless of how accurate any of your predictions ultimately prove to be, you're doing tremendous work at advancing the knowledge in your field. As far as I can tell, no one has previously done this sort of analysis for Canadian elections, and that means that no one can really say which method works.

If it matters, measure it. And that's exactly what you're doing.

Thanks for all your hard work.

3. I think you've demonstrated that had you known the popular vote election results, then your model would've predicted the seat distribution pretty well. That's an accomplishment to be proud of! The problem is that nobody knows the vote intentions of the (mostly irrational) voters and (unfortunately) you cannot adjust your model to deal with this :-(
However, by no means I suggest your efforts to improve the model are useless. I'll be checking 308 regularly. Cheers!

4. It all comes down to this: GIGO. Garbage in, garbage out.

As for the polling companies, they tend to ask eligible voters, NOT likely voters. Whereas if you look at polling done in US elections, there are polls taken of three different types of voters: eligible voters, registered voters, and likely voters. The higher the intensity of the voter, the better the result.

The moral of the story is that polling companies in Canada need to ask likelihood of voting in addition to who you would vote for. This was done in a couple of polls in the recent election and it showed higher Conservative and NDP support. If you are applying certain weights to different polling results, it would make sense to add more 'weight' to those polls that measured likely voters as compared to individuals who are not likely to vote.

5. It's OK. Don't be sorry. Your method was still the best under the circumstances.

But please, post an update of the riding polls alongside the results (past election, corrected riding poll results, final results).

Nowhere can we see this synthesis of the results. Nowhere can we understand at a glance what happened where. Every website has a very complex module for showing results, and we can never compare. It would also be great to include the names of the candidates (past and new) and, for those who were there before the election, since when.

Don't be sorry. Voters themselves don't know what they did. Your job is to reveal the facts... and you do it very well !

Thanks !

Simon

6. Eric:
thanks for the analysis on the methodology
but I think you need to consider a few things:
1. the undecideds... there were about 7-10% undecided in the last polls;
2. Harper's appeal to Blue Liberals - can't be detected until an exit poll;
3. a possible bin Laden factor - it was all over the news - could reinforce the "stable govt" message of the Tories.

I still think that you should project on every last minute poll and just issue a band of numbers for your seat projections.
Let the readers decide if they want to take a mid-point or maybe check it against the individual seat projections, assess some weight for incumbents or local conditions etc.

How can you predict the election results based on poll info that may be accurate for the day before, at best, when there's a high undecided vote?

all we can do really is look at the trend - this was what was most beneficial about Nanos and EKOS tracking polls over the last week of the campaign.

anyways, keep up the good work, as always!

7. It's a pity that interest in the site will probably flag a bit until the next set of elections. Seeing how you adjust the model is probably going to be more interesting than the actual seat projections were.

I find it interesting that you couldn't find a way to get the Conservative vote right, and had to conclude the pollsters were simply too far off in their own numbers. In the run-up to the election, many people pointed out that the polls frequently tend to underestimate Conservative support and overestimate Liberal support. It's nice to have empirical confirmation of that. Is that due to bias, flawed methodology, or just differences in the likelihood of which supporters will actually show up on election day?

8. No model is going to be even close to perfect for every election. What you did was still quite amazing. You still called the NDP to be second by a good margin of seats. I blame most of the inaccuracies on the polls not being correct. Keep up the good work and I hope you will be around in 2015 to give it another go. Provincial elections should be easier because most have only two parties of any significance.

Good luck

9. No offense but this seems like very sloppy science. You're trying to fix the model by just copying whatever numbers would have worked out best in this most recent election - why? The most recent election was completely unexpected, in multiple ways, so there's no reason to think that the next election will follow the same pattern. This seems like trying to predict the future direction of the stock market by measuring the market average going backwards.

If you want a more robust result, you need to find some logical reason to use a particular weighting or poll-average. Is there some reason why we can expect polls to be reduced by 25% per day? Probably not, but perhaps there was a way you could have better predicted that number in ADVANCE, rather than just picking it after-the-fact.
-Luddite

10. Luddite,

Using a 23% daily reduction would have worked best in the 2011 election and would have produced good results in the 2006 election. I am not suggesting it is a magic number, but I have to choose some method of reducing the weight of polls over time, and the best I can do is use what would have produced good results in the past.

11. Hi Eric,

Question: is it possible that the high voter turnout for advance voting might have affected the actual results for the Conservatives? The polls indicated higher Conservative support around the time of advance voting than they did at the time of election day.

CRS

12. CRS,

Yes, it could have. It may have been a very important factor.

However, the Liberals were also doing better at the time, so using the same logic we should have expected the Liberals to out-perform the last week's polls, not under-perform.

13. I don't see how advance voting would have impacted the final polls unless those individuals were excluded from the poll.

They'd still be a portion of the sample population and a portion of the overall population as well. Provided that polling was done correctly, advance votes should have had no impact whatsoever - if anything, it should have made the later polls more accurate as, for a certain subset of the sample population, they would have responded with 100% confidence on how they had voted.

14. That assumes people don't change their minds. You may vote one way on the advanced polling day, but when asked how you would vote "today" a week later you might respond differently.

15. Eric ... "That assumes people don't change their minds. You may vote one way on the advanced polling day, but when asked how you would vote "today" a week later you might respond differently."

You really think that people who vote in advance are at all likely to change their minds during a few days? At some point in time, common sense and examination of psychological biases have to kick in rather than thoughts of pure theoretical possibilities.

Look at the results of the polls ... few people in general actually seemed to have changed their minds during the election (10-15% or so over the course of the entire campaign).

The number of advance voters that change their mind, and report on it, between voting and polling is going to be insignificant. Anyone with any knowledge of psychological biases will know that once someone performs a committing act like voting, they're incredibly unlikely to report that they've changed their mind in a short period of time ... and that's for general situations, let alone something like the election where so few people changed their mind during the campaign.

16. Eric,

Interesting as always.

Are you still dating the polls based on the mid-point of their collection period during the election? If so, wouldn't a 23% daily reduction mean that longer polls (such as a week) would be weighted almost to zero by the time they're released?

Have you experimented at all with non-linear reduction in weights over time (for example, 10% for the first few days, then increasing to 25%, etc)?

JamieH

17. JamieH,

That's an interesting point about the polls with longer field dates, but if we look at the Harris-Decima and Nanos polls that were released on May 1, we see the weighting in BC would have been 1.19 for Nanos (taken over the last two days) and 0.41 for Harris-Decima (taken over the last two weeks) with the 23% daily reduction.

That sounds about right to me, rather than having both Nanos and Harris-Decima be weighted according to May 1.

I have not experimented with non-linear reduction, but I imagine that based on the inaccuracies of the polls no method will give perfect results.

18. I think that one useful feature for the future would be to Monte Carlo the results. Currently you have a mean value for the votes in a given region, but given the MOEs and your weightings you also have an estimated uncertainty. So if you sampled party distributions consistent with that uncertainty and then folded those through the seat projection, you could get a range of outcomes in the seats that would more properly indicate what we know.
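The Monte Carlo suggestion above could be sketched roughly as follows. Everything here is illustrative: the party means and margins of error are made-up numbers, and the final threshold is a stand-in for folding each draw through an actual seat projection:

```python
import random

def monte_carlo(means, moes, n_sims=10000, seed=1):
    """Sample vote-share scenarios consistent with each party's
    projected mean and margin of error (treated here as roughly
    two standard deviations)."""
    random.seed(seed)
    draws = []
    for _ in range(n_sims):
        draws.append({p: random.gauss(m, moes[p] / 2) for p, m in means.items()})
    return draws

# Hypothetical national projection: mean share and MOE per party
means = {"CPC": 37.0, "NDP": 31.0, "LPC": 20.0}
moes = {"CPC": 2.0, "NDP": 2.0, "LPC": 2.0}
sims = monte_carlo(means, moes)

# Fraction of simulations where the CPC lead over the NDP exceeds
# 8 points - a stand-in threshold, not a real seat model
p_big_lead = sum(d["CPC"] - d["NDP"] > 8 for d in sims) / len(sims)
```

In a full implementation, each sampled scenario would be run through the seat model, yielding a distribution of seat counts (and hence a probability of a majority) rather than a single projection.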

19. Thanks for your work. It's been a pleasure reading it. I can understand why you didn't try to quantify the uncertainty of your projections - that's a lot of work.

But as for projecting the popular vote, I don't understand your approach. An average using weights that decrease with age is a projection that assumes a slope of 0 in the trend over time. Why would you assume a 0 slope? The NDP had a very positive slope in the second half of the campaign, and the Liberals had a slightly negative one. You could use a linear regression over a recent window, or a local regression or some other smoothing technique that isn't prone to high error at the ends of the time series. It may, or may not, be necessary to adjust such a projection for each party's "turn-out efficiency," if the variance across parties within elections is significant and the variance within parties across elections is low.

Just my 2 cents.
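The zero-slope point in the comment above can be illustrated with a simple least-squares trend extrapolated to election day. The NDP numbers below are hypothetical, and a real implementation would likely want a windowed or local regression rather than this plain linear fit:

```python
def linear_trend_projection(observations, election_day):
    """Ordinary least-squares fit of share vs. day, extrapolated to
    election day -- instead of assuming a flat (zero-slope) trend."""
    n = len(observations)
    mean_x = sum(d for d, _ in observations) / n
    mean_y = sum(s for _, s in observations) / n
    sxx = sum((d - mean_x) ** 2 for d, _ in observations)
    sxy = sum((d - mean_x) * (s - mean_y) for d, s in observations)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * election_day

# Hypothetical NDP shares over the last days of the campaign (day, %)
ndp = [(28, 26.0), (30, 27.5), (32, 29.0), (34, 30.5)]
projected = linear_trend_projection(ndp, 36)  # → 32.0
```

A decayed average of the same four points would land somewhere below 30.5%, while the trend extrapolation follows the rise through to election day - which is exactly the difference the commenter is pointing at.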

20. AverageCanuck 06 May, 2011 17:45

EKOS and HD need to be dropped from your model.

Try re-running your calculations without them and you'd have been A LOT closer to the real results.

When you say your model "would not have called the Conservative majority - just like the polls did not" you're incorrect.

Lots of polling firms were within the MOE of a CPC majority.

Ekos and HD were not.

Ipsos Reid, Nanos, and Angus Reid have been very good the last few elections.

21. In some ridings the outcomes of the election were very uncertain and only emerged AFTER the advance polls closed. Ridings like Esquimalt - Juan de Fuca were thought to be Liberal - Tory races that turned into a tight NDP - Tory race with an NDP win. The NDP rise in Quebec came later in the campaign... and many strategic voters might have changed their decision if they hadn't already voted in the advance polls.

So it's definitely possible that advanced poll voting could skew things. But here's a question... what's the average/regional ratio of people voting in advanced polls versus election day? And has this number changed over the years, particularly this election? Important data for campaigns to know as they develop their GOTV literature and strategies.

22. Hi Éric. I keep coming back to this fascinating site.

If I may, I'd like to amplify a couple of suggestions from other readers in this thread.

1. I enthusiastically agree with '05 May, 2011 23:49' that "that you should project on every ... poll". What makes this site great is its seat projections, but at the bottom of the page I see hundreds of percentage lines from pollsters and only a handful of seat projections.

2. I also agree with jbailin (06 May, 2011 13:34) that a "useful feature for the future would be to Monte Carlo the results". i.e. based on the margin of error in the poll, you could produce a 95% confidence interval for each seat projection, among other confidence levels. You could also estimate the probability of party x winning a majority, or the most seats, or a top-2 finish, etc.

Combining these suggestions, you would then have more seat projections and other probability estimates than input polls. This is what we crave. :-)

-ST

23. Not sure who said it ... "In some ridings the outcomes of the election were very uncertain and appeared AFTER advanced polls closed."

I'm not sure that you're following my statements on advance voting. What I'm saying is that people who vote in advance will respond to the polling in the same way as they vote. Thus, the polls don't have to be weighted to reflect different periods of voting times.

Frankly, I'm stunned that this is a contentious topic.

24. I will try again.

Your projection model will not work if you use polls that do not reflect reality.

There are pollsters that would have shown the CPC well under 40% on May 2nd or 3rd, or today.

For example, do you believe that if EKOS let loose their dialing program today, it would find CPC support over 35%?

25. EKOS over-estimated Conservative support in 2006.

26. AverageCanuck 07 May, 2011 16:01

EKOS over-estimated CPC support in '06 by 0.7%, which is well within the MOE. Most pollsters were very good in the '06 election.

In '08 and in '11 they were off by 3% and 5.7% respectively.

Why are you defending Frank Graves ?

Let them take their lumps.

They've blown it for 2 elections in a row and need to go back to the drawing board and tweak their system!

27. In the situation we had of extreme variability/volatility Eric I don't care what methodology you use it won't be right !!!

Like it or lump it just the time shift is sufficient to screw the whole thing up !!.

28. Re EKOS & Frank Graves.

Mr. Graves certainly has a "history" re the Grits and Tories, but we should judge him on his results.

EKOS was using what they called "IVR technology" this year, essentially the "robo-polling" that Rasmussen, SurveyUSA and PPP use in the United States.

The sample response patterns of robo-polling with recorded voices versus traditional operator-assisted methods took a while to sort out in the US; I suspect it will take a while up here as well.

Let's give Graves the benefit of the doubt and chalk it up to a learning curve on new technology.

In 2015 if he is out by 6% again we can then start claiming either bias or incompetence, but let's give him a break on the first kick at the can with the new technology.

the Vorlon

29. AverageCanuck 08 May, 2011 16:50

Vorlon, didn't Graves use robo-polls in '08 as well?

I'm all for second chances but this is twice in a row that he's blown it.

Something about prompting Green and Other seems to drain CPC support and inflate their numbers.

30. @Average canuck...

The HUGE issue that Canadian pollsters have not sorted out is actually the "likely voter" models.

For the last 4 elections, turnout has been between 58% and 65%; it simply makes no sense to poll ALL voters when only 2/3rds, or less, will actually vote.

NANOS "kinda" gets around this problem by NOT prompting; he just asks which party they support. (In the industry this is called the "unaided ballot.")

Somebody who is truly likely to show up and vote on election day should, generally speaking, at minimum know the party they intend to vote for.

This is why EKOS and others inflate the greens so badly, people who don't know/don't care/are not gonna vote anyway will often, when presented with a prompted list, just randomly pick a name. If you include the "Fred Party" on a list, "Fred" will get 5%...

In the US, the shift between a poll of adults and "likely voters" is typically in the 3-4% range, which is just about exactly the margin the Tories were systemically underpolled by in both 2008 and 2011.

We know that senior citizens (65+) have an almost 80% voting rate. We know that those under 25 vote at less than half that rate.

These are facts. NOT factoring this reality into your sampling is just bad science IMHO.

31. I'd agree that checking how likely a person is to vote should be a big factor in polling. When I heard some polling firms were not adjusting for the fact that under-25s vote at a much lower rate than over-65s, I instantly went 'crap' (as a Green supporter, I was hoping vote mobs might help push the party above its lower levels, but if kids are weighted as if they will vote at the same rates as seniors, then the best hope was for a minor drop from the polls).

As a person with a degree in stats who does surveys for a living (not political ones), I was very surprised at the poor use of controls by the Canadian firms. It seems to be dumb luck more than skill if they hit the election results dead on. You have to balance for age, gender, financial status, and any other variables that affect the odds of a person voting (riding is a big factor too). Some of those variables might be too weak to bother with, but age and odds of voting should be factored in as strongly as making sure you balance the regions properly (i.e., not having 500 of the 1,000 respondents from PEI or something).
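The turnout adjustment proposed in the comments above could look something like this toy sketch, which reweights each respondent's stated vote by their age group's turnout rate. The sample and the rates are illustrative (the ~80% and under-half figures echo the numbers cited in the thread):

```python
def turnout_weighted_share(respondents, turnout_rates):
    """Weight each respondent's stated vote by the turnout rate of
    their age group, rather than counting every respondent equally."""
    weights = {}
    for party, age_group in respondents:
        w = turnout_rates[age_group]
        weights[party] = weights.get(party, 0.0) + w
    total = sum(weights.values())
    return {party: 100 * w / total for party, w in weights.items()}

# Hypothetical sample of (stated vote, age group) pairs: equal raw
# support, but very different likelihoods of actually voting
sample = [("GRN", "18-24")] * 10 + [("CPC", "65+")] * 10
rates = {"18-24": 0.37, "65+": 0.80}
shares = turnout_weighted_share(sample, rates)
```

An even 50/50 raw split becomes roughly 68/32 after weighting, which is the direction of the correction the commenters are describing: younger-skewing parties shrink, older-skewing parties grow.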

32. Peter Woolstencroft 09 May, 2011 12:28

Eric: are you able to provide the vote totals and percentages for the parties in and out of Quebec?

Thanks so much - and for your always interesting discussions. Eventually one will be impaled on a poll!

33. Eric,

Have you considered a stepped reduction in the polls weight ?

Example:
Polls 15-10 days old: reduced 35%
Polls 9-5 days old: reduced 26%
Polls from the last 4 days: reduced 19%

I just made up the %s but I hope you understand my suggestion.

Immanuel

34. I've considered it, but I don't think there is a perfect system. The key is to find the system that will work best most of the time. I prefer to keep it simple when possible.

35. As long as you're using any data from old polls in your weight, there is a contradiction in your method.

A poll gives the current understanding of how people are likely to vote at that specific point in time. When you pull old polls into the weighting, you are removing evidence from today's situation and replacing it with data that is out of date. You either believe that polls accurately reflect that point in time or you don't. Mixing the two (using old and new data) basically says that you don't believe current polls accurately represent what they're supposed to.

Data mining and looking through past patterns may find ways that old polls can be used to predict election results, but this is little different from noticing that past Super Bowl outcomes can be used to predict the evolution of the stock market.

The only use for older polls is to check / verify if current polls are the 1 in 20 polls that is an error or if they accurately reflect what is happening. However, in this case you have the polls from the other firms to verify this, so there is no additional information provided by the older polls.

Rather than focusing on the weighting of older polls, a methodology that would be more in line with the data you have is to understand how to appropriately weight only the most recent polls (i.e. congruency with other polls is not factored in - if one poll is outside the MOE of all other polls, it's treated the same as if it were in line with other polls - better practice would probably be to discard it). It also seems that you consider inaccuracies to be randomly distributed and that there is no bias in the results - this may or may not be true, but is likely worth investigating.
