Monday, May 27, 2013

Sampling, weighting, and transparency in public polls

If the election in British Columbia has convinced me of anything, it is that there needs to be a greater culture of transparency and disclosure in the polling industry and a healthier relationship between pollsters and the media. Those in the media (including, peripherally, myself) need to ensure that the polls they publish meet the highest standards possible, and a policy of full disclosure on the part of polling firms will help those in the media better judge what potential issues might exist in any given poll.

There also has to be a greater understanding of what makes a good poll. Building a representative sample is an absolutely basic requirement. Understanding who is most likely to vote - and accounting for that in polls conducted during an election campaign - is something that needs to be improved and emphasized going forward.

Last week, I took a look at the role turnout plays in accurate campaign polling. Today, I'll look at the question of building a representative sample, and the role disclosure plays in knowing whether a sample is representative or not.

Late last week, The National Post published the results of the latest federal poll from Forum Research. It showed the Liberals at 44%, the Conservatives at 27%, and the New Democrats at 20%. Aside from a three-point drop for the Tories - a shift more or less on the edge of statistical significance - those numbers were mostly unchanged from Forum's last federal poll, released just after Justin Trudeau became Liberal leader. But the numbers are eye-popping, to say the least.

The full report for Forum's poll was published on Forum's website shortly thereafter. There are some things to praise in Forum's disclosure policy. The firm seems to release virtually every crosstab it has, from age to religion to ethnicity to marital status and more. Few polling firms do this, and those that don't include some of the best-known and most successful in Canada. Nanos Research, for example, shows only the sample sizes of its regional numbers in its national polls but no other demographic breakdowns. Harris-Decima and Angus-Reid report plenty of demographic results, but no sample sizes beyond the number of people polled nationally.

From that perspective, Forum's disclosure policy is among the better ones in Canada. But from what I can determine, the sample sizes that Forum reports in its tables are unweighted. That it isn't easy to figure out what the sample sizes represent is unfortunate; that Forum doesn't also include its weighted samples is more so.

Here again, Forum is in good company, as several other polling firms have similar disclosure policies. But only Ipsos-Reid and Abacus Data routinely release both the weighted and unweighted results and sample sizes of their polls, which should be considered the bare minimum of disclosure acceptable to the media outlets publishing polling results. In the United States, this is common practice.

Some polling firms are reluctant to release weighted results because they consider their weighting schemes to be proprietary. That is within their rights - but when polls are for public consumption and have the ability to influence the national discussion and election campaigns, the firms have a responsibility to be upfront about their methodology.

Since Forum releases its unweighted sample data, we can take a look at its ability to build a representative sample with its interactive voice response methodology. We do not have this ability with many other firms, and that is a problem. That is one reason why Forum is the unlucky one to have its polling dissected in this analysis. Forum is also the newest national pollster to be regularly releasing polls and it is - by far - the most active polling firm in Canada.

The weighting is the hardest part

Who answers a poll conducted by automated telephone dialing? Forum's report gives us a clue, and the results are revealing. Respondents to Forum's most recent federal poll were older, richer, and whiter than the general population.

The chart above (and those below) shows the difference between the proportion of respondents in each age category in Forum's latest poll compared to the most recent data from the census. As Forum's report says, "where appropriate, the data has been statistically weighted to ensure that the sample reflects the actual population according to the latest Census data". From what I can tell from Forum's report, this is what was done.

And it would have had to have been. Fully 60% of respondents to Forum's poll were 55 or older, whereas this cohort only makes up 33% of the Canadian adult population. Only 9% of respondents to the poll were between the ages of 18 and 34, whereas they make up 28% of the adult population. Those are big differences.

Does that matter? The weighting takes care of all that. And 18-34-year-olds don't even vote.

Yes, it does matter. To have a good poll you need to be able to build a representative sample. It is very difficult to get a perfect sample every time, so weighting is used to round off those hard edges. Weighting is not supposed to completely transform a poll - and the fact that Forum had difficulty reaching younger people suggests there are issues with the methodology itself. If it were a truly random sample, there wouldn't be such a discrepancy. The question of reaching the voting population is a separate issue - polls need to be able to reach everyone, and then likely voters can be pulled from the larger sample.
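
To make the mechanics concrete, here is a minimal sketch of census-based weighting, using the age shares cited above (the 35-54 shares are derived as remainders, and Forum has not published its actual scheme, so this illustrates only the textbook approach, not Forum's method):

    # Post-stratification: a respondent's weight is the ratio of their group's
    # share of the population to its share of the sample. The 18-34 and 55+
    # shares are those cited above; 35-54 is derived as the remainder.
    census_share = {'18-34': 0.28, '35-54': 0.39, '55+': 0.33}
    sample_share = {'18-34': 0.09, '35-54': 0.31, '55+': 0.60}

    weights = {age: census_share[age] / sample_share[age] for age in census_share}
    for age, w in weights.items():
        print(f"{age}: weight = {w:.2f}")
    # 18-34: weight = 3.11  (each younger respondent stands in for three people)
    # 55+:   weight = 0.55  (each older respondent counts for about half)

A weight above three applied to a tenth of the sample is exactly the sort of transformation, rather than smoothing, described above.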

And what we are seeing is not Forum's turnout model in action (or is it? Again, we don't know because it isn't explained). Forum's methodological statement itself said the sample is weighted to match the census, and if this was its turnout model it would not resemble any sort of turnout that has occurred in recent elections.

Apart from building a random sample, the size of the sub-samples that are put together is something to consider as well. The total number of respondents aged 34 or younger was 158 in this poll of 1,709 decided or leaning voters. Assuming a random sampling of the population (a stretch for any methodology that does not call both mobile phones and landlines, calling back repeatedly if the first attempts fail), the margin of error for a sample of 158 people is about +/- 7.8%.

But once weighting is applied, the sample will be inflated to represent about 479 people, with a margin of error of +/- 4.5%. However, the sample remains only 158 people with the larger margin of error. Any errors that might have crept into the small sample will be amplified as the sample itself is re-weighted to about three times as many people. This is why Forum tends to use larger samples than other firms - it needs to in order to have usable demographic sub-samples. Still, even a standard poll of 1,000 people should reach about 280 younger Canadians, not 158.

On the flipside, Forum polled 1,025 people aged 55 or older, a sample that would have a margin of error of +/- 3.1%. But that sample will have to be re-weighted to instead represent some 564 people (which would normally carry a margin of error of +/- 4.1%). So Forum's poll has a much better idea of how older people will vote, but unnecessarily so. If the actual support of younger Canadians falls on the far edges of the margin of error of that small sample, the potential for a very inaccurate result increases as the weightings are applied.
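
For those who want to check the arithmetic, a short sketch using the standard 95% margin-of-error formula for a proportion near 50% reproduces the figures above; the weighted "sample sizes" of 479 and 564 are simply the census shares (28% and 33%) applied to the 1,709 respondents:

    import math

    def moe(n):
        # 95% margin of error for a proportion near 50%,
        # assuming a simple random sample
        return 1.96 * math.sqrt(0.25 / n)

    total = 1709                           # decided or leaning voters in the poll
    young_weighted = round(0.28 * total)   # 28% of adults -> 479
    old_weighted = round(0.33 * total)     # 33% of adults -> 564

    print(f"18-34, unweighted n=158:  +/- {moe(158):.1%}")                        # ~7.8%
    print(f"18-34, weighted n={young_weighted}:  +/- {moe(young_weighted):.1%}")  # ~4.5%
    print(f"55+, unweighted n=1025: +/- {moe(1025):.1%}")                         # ~3.1%
    print(f"55+, weighted n={old_weighted}:  +/- {moe(old_weighted):.1%}")        # ~4.1%

    # The tighter weighted figure for the 18-34 group is an illusion: the poll
    # still contains only 158 interviews with younger voters, so the real
    # uncertainty is the unweighted +/- 7.8%, stretched further by the weights.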

The chart below shows the unweighted and weighted samples as well as the relevant census numbers for Abacus Data's latest federal poll, which was released in April. Because of Abacus's disclosure policy, we have all the information we need to run these calculations and determine how Abacus is weighting its data. We can also see that it is possible to build a broadly representative sample, in this case via an online panel.

The discrepancies between the weighted and unweighted samples here are relatively small (though here again, reaching younger Canadians is difficult - a hurdle that every polling firm is trying to get over). The amount of weighting that is being applied, and thus the amount of potential distortion of the results of the poll, is accordingly minor. This is what any media outlet publishing a poll should be able to figure out for itself before publicizing the numbers.

(Note: the small differences among the 30-44 and 60+ age groups between the weighted and census numbers are due to the influence of other weights, i.e. gender, region, etc.)

It might not be Forum's fault that younger people won't answer automated polls. Perhaps that is one of the reasons it has been able to have some success in recent elections: its methodology's limitations mimic turnout in some fashion. But that would then be a happy accident, not a feature, and could become problematic if turnout changes in some dramatic way (i.e., how younger Americans came out to vote in 2008).

Whose landline is it anyway?

But it is not just about reaching people of certain ages. The sample sizes reported in Forum's latest poll show divergences from the population on other measures as well, giving us one indication of the kind of people who answer IVR polls.

Wealthier people are very happy to answer Forum's polls. While households making under $100,000 were adequately sampled (though most were somewhat under-sampled), households with an income of $100,000 or more were far more likely to answer the phone.

Married and widowed people also appear more willing to respond to a poll. Single people were far less likely to respond to Forum's automated poll, one assumes because they were out on the town. But were these numbers re-weighted to correct for the discrepancy? In the case of the unattached, the same issue of amplifying the potential problems of small samples is present.
On religion, the Forum sample was good (though non-Catholic Christians seem to be slightly more willing to pick up the phone). On ethnicity, however, the discrepancies were quite large.

But this is one measure that is hard to figure out. The most applicable census data would be the single ethnic response, since that is what Forum is asking respondents to give. In the last census where this data was available, 32% said they were ethnically Canadian of some kind, whereas 64% of Forum's respondents - double that proportion - said the same. Those who were 'other European' were under-sampled by half - one assumes that includes French Canadians - while those who were not Canadian, British, or European in ethnicity were under-sampled by two-thirds. That is perhaps the most important number, since it is important for polls to reach ethnic minority communities. What weightings were applied to these numbers? We don't know, because the weighted sample sizes are not included.

And what kind of effect might the various re-weightings have when they are applied one on top of the other?
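
One standard answer to that problem is 'raking' (iterative proportional fitting), in which the weights are adjusted to match each demographic target in turn, cycling through the dimensions until they settle. Whether Forum does anything of the sort is unknown; the toy sketch below, with invented figures, simply shows how the extreme weights grow when several badly skewed dimensions must be corrected at once:

    from collections import Counter

    # Invented toy sample, skewed older and higher-income like the one above.
    # Targets are made up for illustration; this is not any pollster's scheme.
    sample = ([{'age': 'young', 'income': 'low'}] * 5 +
              [{'age': 'young', 'income': 'high'}] * 4 +
              [{'age': 'old',   'income': 'low'}] * 25 +
              [{'age': 'old',   'income': 'high'}] * 66)
    targets = {'age':    {'young': 0.28, 'old': 0.72},
               'income': {'low': 0.70, 'high': 0.30}}

    weights = [1.0] * len(sample)
    for _ in range(20):                      # iterate until the weights converge
        for dim, target in targets.items():
            totals = Counter()
            for person, w in zip(sample, weights):
                totals[person[dim]] += w
            grand = sum(totals.values())
            for i, person in enumerate(sample):
                current = totals[person[dim]] / grand
                weights[i] *= target[person[dim]] / current

    print(f"largest weight: {max(weights):.1f}, smallest: {min(weights):.2f}")
    # Weights on the scarcest respondents climb well above 1, and each extra
    # dimension the raw sample misses widely pushes them further out - exactly
    # the amplification problem described above.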

Then there are the memories of past voting experience. This is a problem for pollsters, as people might not remember how they voted or their memories might be playing a trick on them (confusing, say, a by-election or a provincial election for the last federal election). Choosing whether or not to account for this in an election campaign is tricky, and this was one issue grappled with by pollsters in British Columbia.

As you can see, the Forum sample had more Liberals than it should have and fewer New Democrats (it was about right for the Tories). Does Forum apply a weighting for past voting behaviour? We don't know, so we are left to wonder if it is taking this skewed sample into account or not.

But Forum is almost penalized for its disclosure policy. Because it releases unweighted sample sizes we can ask these questions - and that is a good thing. For all we know, those pollsters that do not release unweighted data could be building even more unrepresentative samples. This is why disclosure and transparency are so important. Before publishing, the media should be asking for this information if it is not given up front.

Let the record show

But with shrinking newsrooms and increasing responsibilities, journalists don't have the time to wade through this data, so they often publish what they get. With all the resource constraints in the industry, it is hard to blame them for it. And, in any case, how can you argue with a track record? Forum has done as well or better than most other firms over the last two years.

Because I am focusing on Forum to such a great extent, it would only be fair to take a look at how its polls have done since the firm first emerged in 2011. A free promotional opportunity:
All in all, Forum's track record in its first three elections was quite good, while its results in the last three elections were good compared to those of its competitors. A few notes about these numbers, though:

Forum was the closest pollster active in the final days of the Alberta campaign. I have it ranked second out of eight pollsters here due to an Ipsos-Reid poll that was out of the field just before the campaign began, but within the 30-day window I use to rank pollsters. Nevertheless, Forum still had Wildrose ahead, and based on the information that has come out of Alberta since the election, Forum and the other firms active over the final weekend should have been able to capture at least some of the swing earlier.

In Quebec, Forum has emphasized that it was the only firm to place the Liberals as the Official Opposition. That is true, and to Forum's credit it was able to gauge Liberal support better than the Quebec-based pollsters. But Forum also showed that the PQ would easily win a majority government. CROP and Léger were showing that the PQ was not in a strong majority position, and that the role of runner-up was generally up for grabs.

And in British Columbia, Forum did get close to the final result, but the numbers were outside the margin of error of its final poll and, unlike in Alberta and Quebec, the firm was out of the field with six days to go. Forum has said that since its seat projection hinted at a Liberal majority government, it was the only firm to project a Liberal victory. As projecting seats has nothing to do with polling capabilities, this is irrelevant in comparing Forum's numbers to other pollsters'. And Forum has offered no explanation of how it does its seat projections, despite repeated requests, and its report gave no details about how the numbers were produced.

In fairness, all pollsters try to portray their track records in the most positive light. And it is unfair for polling firms that were not active in Alberta or British Columbia to claim, in the context of those misses, that they have a better track record than the firms that participated in those campaigns. We do not know if they would have done any better, so the comparison is apples to oranges.

Disclosure, transparency, responsibility

A policy of full disclosure does few favours to a polling firm. It opens the firm up to criticism and gives competitors the chance to 'reverse-engineer' its weighting scheme. Better, then, to give as little as possible, since the numbers will be published anyway.

That is the crux of the problem. Media outlets are not in a position to demand more transparency, as many don't pay for polls (especially outside of a campaign), but they absolutely must. Free or cheap polls get good traffic, generate a lot of interest, and are 'exclusives', so they are hard to turn down. It is also difficult to turn a poll down on principle because of non-disclosure when another newspaper will publish the numbers without question. They are in an understandable bind.

This is what should change. We need to demand more disclosure and more transparency before publishing. Pollsters showing their work in greater detail will have all the more reason to ensure that they are doing it well. The media has a responsibility to report accurate information and to ensure that the information it is getting is reliable. Journalists take great pains to do so on other matters, so it should be natural to treat polling the same way.

Pollsters also have an incentive to publish everything. When they get it right, no one can claim that they were lucky. When they get it wrong, the numbers they published will probably contain some hint as to why - it is better to point to previously published numbers and admit they deserved more attention than to put out unpublished numbers after the fact to explain away errors. And a polling firm that confidently publishes its data is a polling firm that is easier to trust.

Democracy benefits as well. Accurate public polling is a necessity in a free democracy as otherwise campaigns would be dominated by the leaked and spun numbers of internal polls, and voters would cast their ballots with less information than they should have available to them.

A culture of transparency will help foster a climate of quality, responsible polling - and make it easier to call out those who are doing a bad job. We saw this in the United States, where some polls were under-sampling and under-weighting visible minorities, over-estimating Republican support as a result.

I have thought about my role in this debate, and seriously considered not writing about polls that do not release full data. But this site can do a better service by drawing attention to these issues instead of ignoring a large chunk of polling done in Canada. Hopefully, the election in British Columbia will act as the catalyst for positive change in the industry and for those who write about it.