Over at FiveThirtyEight.com, Nate Silver has a post attempting to debunk the idea that there is momentum in political campaigns. But I think he’s wrong. And his post provides a fun opportunity for a simple statistics lesson on the difficulty of discovering momentum.
Here’s what Nate does: He compares the change in polls between successive periods. If there were momentum, he argues, a rise in the polls this month would make it more likely there is another rise next month, while a fall in this month’s polls would likely yield another fall next month. That is, he believes that momentum will cause successive changes in a time series to be positively correlated. Instead, he finds the opposite to be true. (His graph is here.)
But this isn’t a fair test of the momentum hypothesis. Here’s the problem: Imagine that there is some measurement error in the poll taken this month — perhaps the pollsters just interviewed too many Democrats, or too many Republicans. If there’s any kind of measurement error, this could drive Nate’s findings. There are two cases to consider:
If the estimate for this month is too high, it will make the change between this month and last month look too large, and it will make the change from this month to next month look too small (or perhaps negative).
Or, if the estimate for this month is too low, it will make the change between this month and last month look too small, and it will make the change from this month to next month look too large.
Either way, if there’s any kind of noise in your estimate for this month, you get a negative correlation between adjoining changes in polls. But in my example, this negative correlation is driven by measurement error in the polls, not by the presence or absence of momentum.
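To make the two cases concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration, not Nate's data: a hypothetical candidate whose true support moves by independent random steps (so there is no momentum at all), plus polls that add independent sampling noise. The step and noise sizes are arbitrary.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n_months = 5000         # hypothetical number of monthly polls

true_support = 50.0
truth, polls = [], []
for _ in range(n_months):
    # Independent steps: by construction the true series has NO momentum.
    true_support += rng.gauss(0, 0.5)
    truth.append(true_support)
    # Each poll is the truth plus independent measurement error (assumed).
    polls.append(true_support + rng.gauss(0, 2.0))

true_changes = [b - a for a, b in zip(truth, truth[1:])]
poll_changes = [b - a for a, b in zip(polls, polls[1:])]

# Correlate each month's change with the change that follows it.
corr_true = pearson(true_changes[1:], true_changes[:-1])
corr_polls = pearson(poll_changes[1:], poll_changes[:-1])

print(f"adjacent changes, true support: {corr_true:+.3f}")
print(f"adjacent changes, noisy polls:  {corr_polls:+.3f}")
```

Under these assumptions the true changes are close to uncorrelated, while the polled changes come out clearly negatively correlated, purely because each month's noisy poll enters two adjacent changes with opposite signs.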
The problem is that he’s using the same measurement — this month’s poll — in constructing both of the variables he’s analyzing. And this is likely responsible for much of the (negative) correlation he observes. So perhaps this negative correlation is in fact disguising true momentum in political races.
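The same point can be written as a short derivation. The notation is mine, not Nate's: $y_t$ is the published poll, $x_t$ is true support, and $e_t$ is measurement error, which I assume is independent across months, has constant variance $\sigma_e^2$, and is uncorrelated with true support.

```latex
\begin{align*}
y_t &= x_t + e_t \\
\Delta y_t &= y_t - y_{t-1} = \Delta x_t + e_t - e_{t-1} \\
\operatorname{Cov}(\Delta y_{t+1}, \Delta y_t)
  &= \operatorname{Cov}(\Delta x_{t+1}, \Delta x_t)
   + \operatorname{Cov}(e_{t+1} - e_t,\; e_t - e_{t-1}) \\
  &= \operatorname{Cov}(\Delta x_{t+1}, \Delta x_t) - \sigma_e^2
\end{align*}
```

The first term is the true momentum; the second is a negative bias that comes entirely from $e_t$ appearing in both changes. If $\sigma_e^2$ is large enough, the measured covariance is negative even when the true one is positive.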
Perhaps a simple example will help. I’m going to use data on the black unemployment rate to make my point, just because these are the data I have handy:
[Chart 1: the black unemployment rate. To view the chart, please visit brookings.edu on your desktop.]
It’s crystal clear that there’s substantial momentum here: When black unemployment has been rising for a few months, it usually continues to rise; when it has been falling, it usually continues to fall for a few more months.
But let’s see what happens if we perform Nate Silver’s test, analyzing successive monthly changes in the black unemployment rate.
[Chart 2: this month’s change versus next month’s change in the black unemployment rate. To view the chart, please visit brookings.edu on your desktop.]
Instead of finding positive momentum, we find that this month’s change is negatively correlated with next month’s change in unemployment! This would lead Nate’s approach to (wrongly) conclude there’s no momentum. The reason is that the black unemployment rate is measured with error, and by construction, these errors cause this month’s change and next month’s change to move in opposite directions.
There’s a simple solution: Analyze changes where you don’t use the same measure — this month’s unemployment rate — in constructing both your dependent and independent variables. I try this alternative test in the next graph, showing the change in the black unemployment rate between next month and this month versus the change between last month and the previous month:
[Chart 3: the change between next month and this month versus the change between last month and the previous month. To view the chart, please visit brookings.edu on your desktop.]
And now this analysis shows what was obvious from the first graph: Yes, there is momentum in the black unemployment rate.
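Here is a hedged sketch of the two tests side by side on simulated data. It does not use the actual unemployment series; it assumes a hypothetical series whose true monthly changes are persistent (an AR(1) process with coefficient 0.8, my choice) and which is measured with independent error. The function names and parameter values are illustrative only.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_measured_series(n, rho, step_sd, noise_sd, seed):
    """Series with momentum (persistent true changes), observed with error."""
    rng = random.Random(seed)
    level, change = 0.0, 0.0
    measured = []
    for _ in range(n):
        # Momentum: this month's true change carries over part of last month's.
        change = rho * change + rng.gauss(0, step_sd)
        level += change
        # What we observe is the true level plus independent measurement error.
        measured.append(level + rng.gauss(0, noise_sd))
    return measured


def lagged_change_correlation(series, lag):
    """Correlate monthly changes with the changes `lag` months earlier."""
    changes = [b - a for a, b in zip(series, series[1:])]
    return pearson(changes[lag:], changes[:-lag])


series = simulate_measured_series(20000, rho=0.8, step_sd=0.3, noise_sd=0.6, seed=1)

# lag=1: adjacent changes, which share one measurement (the test I'm criticizing).
adjacent = lagged_change_correlation(series, 1)
# lag=2: changes built from four different measurements (the alternative test).
non_overlapping = lagged_change_correlation(series, 2)

print(f"adjacent changes:        {adjacent:+.3f}")
print(f"non-overlapping changes: {non_overlapping:+.3f}")
```

With these assumed parameters the adjacent-change correlation comes out negative even though the series has momentum by construction, while the non-overlapping correlation comes out positive. A smaller `noise_sd` would weaken the negative bias, so how much this matters for real polls depends on how noisy they are.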
Is this what is going on with Nate’s analysis of polling data? I don’t know for sure, because I don’t have his database to test this idea on. But I’m willing to bet it is. Nate: Here’s my proposal. Re-run your analysis, but instead of relating the change between periods A and B to the change between periods B and C, relate it to the change between periods C and D. I’m confident that you’ll find less evidence of negative correlation; you may even find evidence of a positive correlation.
In fact, let’s bet a fancy dinner on the outcome — I reckon you’ll find that non-overlapping changes in polling are in fact positively related. That is, there probably is momentum in political races.
Do we have a bet?