The 2016 presidential election is by all accounts one of the most unusual and anxiety-producing in American history. The Donald Trump candidacy is literally unprecedented. A Trump victory may be unimaginable to many but it is not impossible. Can we forecast the outcome of this extraordinary election based on methods and models developed for past elections?
For decades economists and political scientists have constructed and tested forecasting models for presidential elections. Their objective was not to explain why all Americans vote as they do but to predict the outcome of the next election by identifying the forces responsible for changes in the vote won by the two major parties. They have sought to specify and measure the fundamental drivers of electoral change as a basis for making an explicit public prediction well before the election. These models form a sort of baseline from which we can judge whose election it is to win and whose it is to lose.
The vote as a verdict on the overall economy
Some of the initial forecasts, most prominently those of Yale economist Ray Fair, rely on a large set of diverse economic indicators, some of which have to be estimated before the forecast is released. The idea behind the model is a familiar one: the incumbent party will be punished for bad economic times and rewarded for especially good economic times. Fair also uses an indicator of the incumbent party’s consecutive time in the White House, based on the historical evidence that incumbent parties face formidable hurdles in extending their consecutive hold on the White House beyond two terms. Models that eschew any public opinion measure of the current political climate or of the candidates’ standing have been most susceptible to large errors in predicting both the winner and the incumbent-party share of the two-party national vote for the presidency.
Adding a snapshot of public opinion
Political scientists have been more parsimonious in their use of economic indicators, typically relying on a single indicator tied to conditions in the election year. That’s because voters tend to be myopic when it comes to evaluating economic performance, in effect asking “what have you done for me lately?” Economic growth in the first half or the second quarter of the election year is one such measure, although real disposable income per capita and an index of leading economic indicators have advantages as well. Importantly, most of these forecasting models have added a public opinion measure that captures the political standing of the incumbent party, even if the president is not a candidate in the upcoming election. One good measure has been the presidential approval rating in June or July of the election year. Some scholars have instead used trial-heat polling measures of the two major-party candidates taken relatively early in the campaign.
These latter models of fundamentals incorporating some measure of public opinion have on average had better track records in forecasting the winner and the size of the national vote. For example, Alan Abramowitz’s “Time for Change” forecasting model, based on the incumbent president’s net approval rating at midyear in the Gallup Poll, the growth rate of real GDP in the second quarter of the election year, and whether the incumbent president’s party has held the White House for more than one term, produced the most accurate prediction of the 2012 presidential election among this set of forecasting models.
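To make the structure of such a fundamentals model concrete, here is a minimal sketch in Python. It assumes a simple linear form built from the three inputs named above. The class name, the parameter names, and every coefficient are placeholders invented for illustration; they are not Abramowitz’s published estimates.

```python
from dataclasses import dataclass


@dataclass
class FundamentalsModel:
    """Linear forecast of the incumbent party's share of the two-party vote."""
    intercept: float       # baseline vote share, in percentage points
    b_net_approval: float  # effect of one point of midyear net approval
    b_q2_growth: float     # effect of one point of second-quarter real GDP growth
    b_extra_term: float    # penalty when the party has held the White House more than one term

    def predict(self, net_approval: float, q2_growth: float, more_than_one_term: bool) -> float:
        return (
            self.intercept
            + self.b_net_approval * net_approval
            + self.b_q2_growth * q2_growth
            + self.b_extra_term * (1.0 if more_than_one_term else 0.0)
        )


# Placeholder coefficients, NOT estimates from any published model.
toy_model = FundamentalsModel(intercept=50.0, b_net_approval=0.1, b_q2_growth=0.5, b_extra_term=-4.0)
toy_share = toy_model.predict(net_approval=0.0, q2_growth=2.0, more_than_one_term=True)
print(f"Incumbent-party share of the two-party vote: {toy_share:.1f}%")
```

In an actual model the coefficients would be estimated by regression on past elections rather than chosen by hand, and the forecast would carry a margin of error.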
This year, however, the release of forecasts based on these models just before Labor Day did little to reduce the uncertainty. There was no consensus on who would win, much less on the size of the victor’s margin. Abramowitz’s model forecasts a narrow Trump victory, but he expects it to fail this year. Why? The model assumes that both parties will nominate mainstream candidates capable of unifying their parties and that both candidates will run reasonably competent campaigns. The nomination of Trump by the Republican Party, in Abramowitz’s view, appears to violate both assumptions. The two models that include a polling measure on the Clinton v. Trump race both forecast a narrow Clinton victory. In contrast, Helmut Norpoth, using only a single predictor (the two nominees’ relative performance in the New Hampshire and South Carolina primaries), forecasts a high likelihood of a comfortable Trump victory, while Fair’s model predicts a near-landslide Trump victory. In fact, Norpoth recently suggested, without the modesty usually associated with academic modeling, that a Trump victory is “a virtual certainty.”
Note that all of these models forecast the presidential election based on the winner of the national two-party vote, and do not factor in third-party candidates. And they assume that a majority or plurality of the popular vote will produce the 270 electoral votes needed for victory. Those are reasonable assumptions based on the historical record but, as William Howard Taft and Al Gore would testify, not foolproof.
Data-intensive, continuous polling
Another approach to forecasting is to estimate the electoral vote outcome directly, by using sophisticated statistical analyses of mostly state and national trial-heat polls conducted continuously throughout the general election campaign. Since 2004 biophysicist Sam Wang has conducted a meta-analysis of state presidential polls to provide more accurate snapshots of the race for a majority in the Electoral College. His Princeton Election Consortium (PEC) blog hosts reports on his forecasting model, which beginning in 2012 added specific predictions. Nate Silver of FiveThirtyEight was a pioneer in this development. He initially forecast the 2008 presidential election. His blog became an essential resource for election watchers and remains so today. When Silver moved from the New York Times to ESPN in 2013, the Times under Josh Katz developed its own polling-based forecasting model as part of The Upshot platform. Finally, statistician Drew Linzer, who has written the most comprehensive scholarly case for his state-based forecasting model, introduced his Votamatic site for the 2012 election. Linzer’s model became part of the Daily Kos website in 2016.
The Upshot conveniently provides daily updates from the four models of the probability each attaches to the predicted winner of the presidential election nationally and in each of the fifty states. As this blog was written, all four predicted a Clinton victory with probabilities ranging from 67 to 90 percent; the three reported electoral vote forecasts range from 298 to 324. Clinton has maintained a consistent lead in all of the models since midyear, although the size of her lead and the likelihood of her victory have fluctuated, more in some than in others.
Understanding the models
Detailed descriptions of each model are provided on the individual sites. The models are similar in a number of respects. They all use state polls and then statistically adjust them to deal with outliers and to reflect recent trends. They also use past election polling data to improve the accuracy of their use of current polls. Three use simulation techniques to estimate the probability of state election outcomes; PEC uses a measure of the median state poll. Three of the models use polls collected by HuffPost Pollster; FiveThirtyEight aggregates state and national polls itself.
But the models also differ in a variety of respects: which polls to include; whether some polls are discounted on grounds of quality or house effects; whether and how to factor in any measure of the fundamentals of the current cycle and past election results; the details of the statistical operations; and what information is made available publicly. FiveThirtyEight provides the results of three forecasting models: its default polls-only forecast, a polls-plus forecast that includes economic and historical data, and a now-cast forecast, presenting the likelihood of a candidate winning today. The polls-plus is the most stable over the course of the campaign, but the three models are expected to converge as we get closer to the election. Today Clinton’s odds of victory range from 66 to 68 percent across the three FiveThirtyEight models, about the same as they were in June.
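The simulation step that several of these models rely on can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the procedure of any of the four forecasters: it assumes each state is decided independently, whereas polling errors in practice tend to be correlated across states. The function name and the toy map are hypothetical.

```python
import random


def simulate_win_probability(states, threshold=270, n_sims=10_000, seed=0):
    """Share of simulated elections in which the candidate reaches `threshold` electoral votes.

    `states` maps a name to (electoral_votes, probability_of_winning_that_state).
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    wins = 0
    for _ in range(n_sims):
        # Each state is won or lost independently (a simplifying assumption).
        electoral_votes = sum(
            votes for votes, p_win in states.values() if rng.random() < p_win
        )
        if electoral_votes >= threshold:
            wins += 1
    return wins / n_sims


# Hypothetical map: one safe bloc plus two toss-ups (not real 2016 numbers).
toy_map = {
    "Safe bloc": (260, 1.0),
    "Toss-up A": (20, 0.5),
    "Toss-up B": (10, 0.5),
}
toy_probability = simulate_win_probability(toy_map)
print(f"Simulated win probability: {toy_probability:.2f}")
```

In this toy map the candidate needs only one of the two toss-ups to reach 270, so the simulated probability should land close to 0.75. Allowing errors to move together across states would generally widen the spread of outcomes.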
I mentioned above that HuffPost Pollster provided the polling data for most of these state-based forecasting models. Real Clear Politics also compiles publicly available national and state polls and reports simple averages of recent ones. Pollster applies its own criteria in deciding which polls to include and then calculates poll “averages” differently by using a statistical smoothing technique involving many simulations to build trend lines into the estimates of national and state polls to reduce the “noise,” mostly the impact of outliers. Three of the forecasting models (not including PEC) use a similar technique. FiveThirtyEight makes additional adjustments in these estimates. All of which reminds us that where the race for the national and state popular vote stands at any point in time is not a known quantity but a number of approximations based on different aggregating procedures. To the extent these estimates remain stable or move in the same direction, we can be more confident. At the very least, polling “averages” are more reliable than individual polls.
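The difference between a simple average of recent polls and a smoothed trend line can be illustrated with a short Python sketch. The exponential smoothing used here is only a stand-in chosen for brevity; it is not the technique Pollster or any of the forecasters actually uses, and the poll margins are made up.

```python
def simple_average(margins, last_n):
    """Plain mean of the most recent `last_n` poll margins."""
    recent = margins[-last_n:]
    return sum(recent) / len(recent)


def exponential_smooth(margins, alpha=0.3):
    """Exponentially weighted trend line; smaller `alpha` means heavier smoothing."""
    trend = [margins[0]]
    for margin in margins[1:]:
        trend.append(alpha * margin + (1 - alpha) * trend[-1])
    return trend


# Made-up margins (candidate lead in points), oldest first, with one outlier.
toy_margins = [4.0, 5.0, 4.0, 12.0, 4.0]
print(f"Simple average of last 3 polls: {simple_average(toy_margins, last_n=3):.2f}")
print(f"Smoothed estimate: {exponential_smooth(toy_margins)[-1]:.2f}")
```

With these made-up numbers the single outlier pulls the three-poll average further from the typical 4-point lead than it pulls the smoothed estimate, which is the sense in which smoothing reduces noise.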
One final point on polling-based forecasts. As rigorous and sophisticated as these models may be, they all have to work with imperfect materials. Polling has become more difficult and polling methods more diverse in recent years. Because of cost and timing, in-person interviews based on area probability samples are not available for election forecasting. Probability-based telephone polls, including those sampling cell phones as well as land lines, face incredibly low response rates, typically in the single digits. It is a leap of faith that those who respond to polls reflect the values, beliefs and preferences of those who do not. Weighting the respondents with Census data to reflect the size of key demographic groups in the voting population, defined by characteristics such as age, gender, race, education, and income, is essential and helps improve the representativeness of the data. But it doesn’t deal directly with potential differences on important variables within these demographic groups between respondents and non-respondents. As an illustration of this point, some scholars have found that fluctuations in reported vote intentions are more a consequence of who agrees to be polled than real changes in vote choice. Automated, voice-recorded telephone polls confront these problems and many more. Internet polls are increasing in frequency and many rely on opt-in, non-probability samples that require even more elaborate weighting to represent the relevant population. Some of the very best internet polling firms have developed careful procedures to try to counter the bias of not having a random sample, both when respondents are selected and when the data is weighted to make it more representative. Unfortunately, these are more the exception than the rule.
And then there is the problem of identifying likely voters. There is no consensus among polling firms and sponsors on how best to do this, and post-election research matching polling respondents with official voting records is not particularly promising.
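The demographic weighting described above can be sketched as follows. This is a minimal single-variable illustration in Python, assuming each respondent belongs to exactly one group; real pollsters weight on several variables at once. The function name, group labels, and numbers are invented for the example.

```python
from collections import Counter


def poststratified_support(respondents, population_shares):
    """Weighted share of respondents supporting a candidate.

    `respondents` is a list of (group, supports_candidate) pairs.
    `population_shares` maps each group to its share of the target population.
    """
    counts = Counter(group for group, _ in respondents)
    n = len(respondents)
    # Weight = population share / sample share, so each group counts in
    # proportion to its size in the population rather than in the sample.
    weights = {g: population_shares[g] / (counts[g] / n) for g in counts}
    weighted_yes = sum(weights[g] for g, supports in respondents if supports)
    weighted_all = sum(weights[g] for g, _ in respondents)
    return weighted_yes / weighted_all


# Made-up sample that over-represents one group relative to the population.
toy_sample = [("college", True), ("college", True), ("college", False), ("non_college", False)]
toy_shares = {"college": 0.5, "non_college": 0.5}
toy_support = poststratified_support(toy_sample, toy_shares)
print(f"Weighted support: {toy_support:.3f}")
```

The unweighted support in this toy sample is one half; reweighting lowers it because the over-represented group is the more supportive one. Weighting of this kind cannot correct for respondents who differ from non-respondents within the same group, which is the limitation noted above.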
Additional Forecasting Tools
To supplement the forecasting models based on pre-campaign fundamentals and polling-based state predictions, other sources of data are being utilized. These include:
- Betting markets. PredictWise, a composite measure included on The Upshot summary of presidential forecasts, describes itself as follows: “The backbone of predictions on this site are market-based, generated from real money-markets that trade contracts on upcoming events.” (See the piece by Phil Wallach on prediction markets.)
- Panels of experts. The Monkey Cage, a political science blog on the Washington Post website, has teamed up with Good Judgment, Inc. to host a forecasting tournament on the 2016 election. Readers of the blog were invited to become participants. Good Judgment has found that ordinary people can be excellent forecasters and “crowd-sourcing” among thousands of interested observers can produce forecasts that are more accurate than any other indicator.
- Wisdom of crowds. Citizen expectations of who is likely to win an upcoming election can be more predictive than their responses to questions on how they intend to vote.
- Models of models. A meta-analysis of similar forecasting models (Ensemble Bayesian Model Averaging) or a composite forecast based on a diverse set of forecasting models (PollyVote).
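The simplest version of a “model of models” is an average of the individual forecasts, optionally weighted. The Python sketch below shows only that basic idea; it is not the PollyVote or Ensemble Bayesian Model Averaging procedure, and the probabilities and weights are made up.

```python
def combine_forecasts(probabilities, weights=None):
    """Weighted average of win probabilities from several models."""
    if weights is None:
        weights = [1.0] * len(probabilities)  # equal weights by default
    total = sum(weights)
    return sum(p * w for p, w in zip(probabilities, weights)) / total


# Made-up win probabilities from three hypothetical models.
toy_forecasts = [0.6, 0.8, 0.7]
print(f"Equal weights: {combine_forecasts(toy_forecasts):.3f}")
print(f"Extra weight on the first model: {combine_forecasts(toy_forecasts, weights=[2.0, 1.0, 1.0]):.3f}")
```

How the weights are chosen is the substantive question. One option is to weight each model by its past accuracy, but that is a design choice this sketch leaves open.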
Those looking for certainty in the upcoming presidential election will not be heartened by this review of forecasting models. The polarized structure of American politics shows no sign of being undermined or even sharply altered by the candidacy of Donald Trump, in spite of the warring forces within the GOP. The partisan shape of the electoral landscape in 2016 is strikingly similar to that of 2008 and 2012. Democrats have been boosted by the changing composition of the electorate over the last several decades, most notably the increase in nonwhites and the declining size of the white working class. They enjoy a modest advantage in partisan identification (by Pew’s measure, about four points) and now begin each presidential campaign with a substantially larger bloc of relatively safe electoral votes than do the Republicans. But the strongest party voting in decades, reinforced by intense racial and cultural tribal loyalties, guarantees more competitive presidential elections than in years past. In spite of almost unprecedented opposition from Republican Party elites, Trump appears positioned to capture the lion’s share of his party base.
The context of this election apart from the candidates and their campaigns is mixed but probably nets out as a small Republican advantage. After eight years with a Democrat in the White House, voters are hankering for change, whatever that might mean to them. The economy has recovered from the Great Recession and unemployment is below five percent, but growth is tepid and wages still suffer from long-term stagnation. The latest news that U.S. middle class incomes grew faster in 2015 (5.2 percent) than in any year in modern history (and that the poverty rate fell by 1.2 points) suggests objective conditions for many families are improving, but subjective measures of present and future economic well-being continue to trail. And try as he might, Barack Obama couldn’t avoid being a polarizing president. Nonetheless, his job ratings and personal approval have been in positive territory this election year.
A generic Republican facing a generic Democrat would likely produce a very close election. But no one would accuse Trump or Clinton of being generic candidates. In spite of the intense support he enjoys at large rallies and on social media, Trump is deeply unpopular—and widely viewed by the public as unqualified for the office and temperamentally unsuited to be commander-in-chief. For most of the campaign, that has produced what Vox called a “Trump tax,” a measure of the extent to which he is running behind the fundamentals.
While Clinton is accurately viewed as being highly experienced and informed, with a temperament that fits the job, and faces no opposition from leaders within her own party, her unfavorable ratings are just about as bad as those of Trump. Saturation media coverage of her private email server and the Clinton Foundation has taken a serious toll on her public standing, and the “Trump tax” has declined in recent weeks.
The state-based polling models, betting markets, panels of experts, wisdom of crowds and composite models of diverse forecasting models all point to a likely Clinton victory. But if Trump were to close the polling gap nationally and in the swing states in the next few weeks, a pattern not witnessed this late in the campaign cycle in presidential polling since at least 1952, the forecasts of most of these models will move with him.
The debates and fall campaigning (especially in light of her lead in fundraising and campaign organization) should reinforce Clinton’s lead and produce a modest victory, not a landslide. The smaller her victory, the less likely Democrats will regain control of the Senate and make solid gains in the House. A Republican Senate would create a huge obstacle to a President Hillary Clinton from day one. (Note that many of these forecasting models predict Senate election outcomes; some do so for the House as well. But that is a subject for another blog.)
What these models cannot do is reduce all uncertainty. The conduct of the campaign and, just as importantly, its coverage in the media during the remaining weeks of the campaign, hold open the possibility, not likelihood, of a President Trump. If the odds of that prospect increase, some longtime Republican voters troubled by Trump’s candidacy may decide to see that it does not come to pass.