Tuesday, November 6, 2012

Nate Silver and the trials of a forecaster

Election night update: Congratulations to Nate Silver on a job well done!

Morning after update: After a long night and early morning, there won't be any posting today. But I invite you to read my article for The Globe and Mail in which I assess how the various forecasters did. The polls did quite well last night, though it is worth noting that Silver's forecast was better than that of RealClearPolitics, which does a simple averaging of the polls. It all makes me lament the polling situation north of the border - how jealous I am of the amount and quality of data that is available in the United States!

There is an election today. Millions of Americans will be heading to the polling booths - if they haven't cast their ballots already - and will be glued to their televisions as the results start pouring in. Millions of people around the world will also be paying close attention to the election results, including a good number of Canadians.

Polls have taken up a large space in coverage of the election, and in the last week or so that focus has been especially directed at Nate Silver, who runs the FiveThirtyEight blog for The New York Times. As Silver's forecasts a few weeks ago still showed that Barack Obama was favoured to win, despite the President trailing by a narrow margin in most national polls, he was heavily criticized for being a partisan hack or simply wrong. Some of these criticisms were well-argued and thoughtful, but most were innumerate or downright personal. The backlash against these criticisms resulted in some of the best analysis of what forecasters do that I have ever read, and even more attention was thrust upon Silver (though I don't think he welcomes it).

As long-time readers of this site know, and as anyone who can look at the URL in their browser and put two-and-two together will figure out, I am an admirer of Silver's work. I was inspired to launch this site because of his own work during the 2008 presidential election, and many of the methods I employed were inspired by his. The idea of weighting the pollsters by their track record, for instance, comes directly from FiveThirtyEight. But after being in operation for more than four years (almost as long as 538 itself), ThreeHundredEight has developed its own distinct character.

But despite very different methods and backgrounds (not to mention track records, though I suspect Silver would struggle just as much with our tricky electoral system and lack of good and plentiful polling), we work in the same field. As a result, a lot of the criticisms that have been aimed at Silver over the last week have hit close to home. Though I have (thankfully) only rarely received the same kind of treatment in national media, and the scale is greatly different, I have seen some of the same kind of criticisms (and insults) on the Internet, in my inbox, and on my Twitter feed, and know what it is like to have your motivations and competence unfairly called into question.

Though this site and others like FiveThirtyEight, and there are several of them in the United States, are thought of mostly as prediction sites, that only scratches the surface. I have always found Silver's analyses of polls to be far more interesting than his forecasts, which have always been based on what the polls are saying anyway. This site tries to do the same thing - providing analysis of polls that goes beyond what can be found in most media coverage, and providing more than just a guess as to what could happen.

Polls take up a lot of space in media coverage in Canada as well as in the United States, and there is a corresponding need for responsible poll reporting in this country. Readers need to understand what polls are able to tell us and what they are not. And they can only gain so much from reading about a single poll in their favourite newspaper. As the US election has made quite clear, different polls can say very different things.

Ten polls released in a short time span tell 10 different stories. Some of those stories will be an accurate depiction of what is happening, and some of them will not. Focusing on one poll or another will lead people to miss the forest for the trees.

The usefulness of a site like ThreeHundredEight or FiveThirtyEight is in telling the story of what those 10 polls are saying, and not just in terms of aggregation and prediction. Silver takes a macro look at all of the polls and describes a narrative that the polling data - and only the polling data - backs up. I try to do the same here. Our projections are about what the polls are saying will happen based on the information available now and what has happened in the past, not what we think or hope will happen, and both of us are always quick to include plenty of caveats.

And those caveats are incredibly important. As of writing, Silver's model indicates that Obama has a 91.6% chance of winning. That means Mitt Romney could still win, but the odds are heavily stacked against him. With an eye towards the limitations of public opinion polling, Silver is forecasting that the odds that the polls will be wrong enough to give Romney the win are very low. There is little valid argument to make against such a statement.

Looking at our recent electoral history in Canada, the fact of the matter is that in only 44.5% of cases has the polling error been large enough to erase the 2.6-point margin that Silver currently gives Obama in the popular vote. And as those errors have an equal chance of going one way or the other, only half of them would hurt Obama. That would give him a 77.7% chance of winning the popular vote, based on how often the polling margin between two parties in Canada has turned out to be wrong on election night. Add Obama's electoral college advantage to that, and you can see why Silver gives Obama such good odds. But that still doesn't mean Romney can't win.
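
For anyone who wants to check that arithmetic, here is a minimal sketch in Python. The function name is hypothetical and made up for illustration, and it assumes (as the paragraph above does) that a large polling error is equally likely to fall in either direction, so only half of the large errors cost the leader the win. It is a back-of-the-envelope calculation, not Silver's model or this site's.

```python
def leader_win_probability(share_of_large_errors: float) -> float:
    """Rough chance that the poll leader still wins.

    share_of_large_errors: fraction of past elections in which the polling
    error (in either direction) was at least as big as the current lead.
    Assumption: errors are symmetric, so only half of them hurt the leader.
    """
    if not 0.0 <= share_of_large_errors <= 1.0:
        raise ValueError("share_of_large_errors must be between 0 and 1")
    return 1.0 - share_of_large_errors / 2.0


# 44.5% is the figure quoted above for errors large enough to erase a 2.6-point lead.
obama_popular_vote = leader_win_probability(0.445)
print(f"{obama_popular_vote:.2%}")  # roughly the 77.7% quoted above
```

The point of writing it out is that the "44.5% of cases" figure counts misses in both directions, while only the misses that go against the leader can flip the result.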

I had my own set of caveats in my final projection for the Alberta election, which was very open to the possibility of the Progressive Conservatives winning a majority government. I was heavily criticized at the time for that openness - until the results started pouring in. (Then I was criticized for not giving the Tories a thumping majority to begin with.) In the recent Quebec election, I forecast that the Parti Québécois would win, but the question was whether it would be a majority or minority government. In future elections I intend to emphasize more of this uncertainty.

These sorts of caveats and confidence intervals might seem like hedging, but it is in fact the responsible thing to do. There always needs to be a recognition of what we do not know - and the potential error in polls is one of those things. But there also has to be a recognition of what is most likely to happen based on the information that is available to us, and that the information is going to be reliable more often than not.

Criticisms of the specific nature of Silver's model might be warranted. It is a very complicated model that will probably perform only slightly better than a simpler model most of the time. But there are a half-dozen well-known forecasting models operating in the United States, and dozens of lesser-known versions. Each has its pros and cons, but to limit the appreciation of the work that Silver does to the percentages in his charts is to miss out on the real value of his work: objective, fact-based analysis of polling data.

Forecasters are not fortune-tellers - our work is based on the data that is available to us and is only as good as the information we are provided. Some critics expect the sort of accuracy that cannot be achieved by models that are based on the polls. When the polls are wrong, models based on them can only do so much. I would have been raked over the coals if I had claimed that the polls were under-estimating the Tories by 10 points in the Alberta election. Expecting any statistical model to foresee and emphasize those 1-time-out-of-20 outcomes is unrealistic, and building a model around them would reduce its usefulness (if the confidence intervals are wide enough, a model will never be wrong). There are plenty of other places to find predictions based on intuition and opinion alone - and anyone who has studied those kinds of predictions can tell you how valuable they are.

Most frustratingly from this side of the argument, a lot of the critics who have leveled their guns at Silver are well-placed to do so. It is very easy to criticize from the sidelines when nothing is risked, like a heckler at a comedy club who would die of stage fright if he were forced on stage. As long as a critique is reasonable (and not based on something as ridiculous as the sound of Nate Silver's voice), the critic is in a win-win situation. If the forecast ends up being good, he or she was just showing a healthy journalistic skepticism. If Silver's forecast turns out to be wrong (and a Romney victory will be seen as such, even if Silver's forecast remains open to the possibility), he or she will have been a clairvoyant.

If Romney wins, Silver will have been proven foolish for having relied too much on those increasingly inaccurate polls. Those who are still calling the race a toss-up also risk nothing, even if Obama ends up winning by a relatively comfortable two or three points. "The race was never as clear-cut as Silver claimed, the undecideds just broke towards the president because of Hurricane Sandy in the last days of the campaign" - you know that argument is coming (and it is already being made).

This no-lose gamble is worth taking if it means "I told you so" gets to trend on Twitter. Everyone wants to be smarter than the egg-head. Silver's reputation is riding on tonight's result. The reputation of those Doubting Thomases won't be dented by their dismissal of his work - but they'll take the credit if he ends up being wrong.

On top of this, Silver can be dismissed as having been "lucky" if his model proves successful. Colby Cosh of Maclean's made the implication in the last paragraph of his otherwise mostly fair assessment of Silver's track record. But focusing on those occasions when Silver was wrong (how his baseball forecasting model couldn't accurately predict a few exceptional individuals, for example, or his struggles with the UK election) seems to miss the point, particularly from someone who wrote a piece the day before the Alberta election in which it was argued that I was under-estimating Wildrose. I don't mean to pick on Cosh, but merely to point out that, sooner or later, everyone trips over something like the 2012 Alberta election or Ichiro Suzuki. And while Cosh does a good job of showing that Silver is not infallible, Silver would be the first to admit that himself.

But all of Silver's hard work, his thousands of words of analysis, his painstaking attention to detail - and if he's right? Luck. The same people who criticize him for assigning probabilities to a one-off event will claim that, if he is right, it doesn't matter because it could have gone either way. It could, of course, but there is a reason why his forecast says one outcome is more likely than the other.

Silver and others in the forecasting business don't have the luxury of chirping from the cheap seats. We bring this upon ourselves, of course, but the best of us are willing to admit when we were wrong and also accept that we could be wrong. A lot of our critics attribute to us a degree of brash, arrogant certainty that most of us simply don't have.

In the end, Silver's forecast is all about the odds. I hope that the cards fall well for him tonight, and if they don't I will still look to him for the best polling analysis anywhere. For my part, this will be the first election in four years that I will watch as a mere observer, and I'm looking forward to a fun night.