Predictor and Explainer for the New York Republican Primary

Today, both the Republican and Democratic parties will be holding their primaries in New York. What should we expect from the Republican primary?  Briefly, the most likely outcome is that Donald Trump will walk away with more than 80 of New York’s 95 delegates. The exact number will depend on the share of the vote that Trump gets, both statewide and in the various congressional districts. However, there are 27 congressional districts, which is a large enough number that we can predict the distribution of results, without worrying about the details of individual districts. That, along with New York’s delegate allocation rules, gives us the following curve:

[Figure: Trump's projected New York delegate count as a function of his statewide vote share]

538’s final polling average has Trump at 52.1%. We can adjust this based on the relationship between Trump’s polling numbers and his final performance, but the adjustment turns out to be pretty small. Best guess is for Trump to get 51.9% of the vote, which would give him about 84 delegates.

That’s the headline. If you’re interested in the details, keep reading.

The red line is the predicted relationship and the black dot is the specific prediction. The blue line is an alternative predicted relationship based on a lower variance among congressional districts (details below).

Delegate Allocation

New York has a total of 95 delegates for the Republican primary. Fourteen of those are allocated based on the statewide results. The other 81 are allocated based on the results in the various congressional districts.

Statewide Delegates

If one candidate gets more than 50% of the statewide vote, they get all 14 of the statewide delegates. If not, delegates are allocated proportionally among the candidates receiving at least 20% of the vote. That’s what creates the discontinuity in the graph above. If Trump gets over 50%, he’ll get all 14. If he gets just under 50% (and Cruz and Kasich both get over 20%), he’ll probably get seven of them.

That graph assumes that both Cruz and Kasich get at least 20%. In the unlikely and very specific scenario where, say, Cruz gets 19.9% and Trump gets 49.9%, Trump would get about 5/8, or about 9 of the statewide delegates. That’s not a big difference, but it is sort of interesting that, in New York, it might be that a strategic anti-Trump movement would want to split their votes between Cruz and Kasich to ensure that both get over the threshold. That runs counter to the conventional wisdom in most states, which says that anti-Trump voters should rally behind whichever non-Trump candidate is more viable.
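To make the statewide rule concrete, here is a minimal Python sketch of it as described above. The function name and the example vote shares are mine, chosen for illustration. The sketch returns fractional delegates, since I have not spelled out how the party rounds them.

```python
def statewide_delegates(shares, total=14, threshold=20.0):
    """Split the statewide delegates given each candidate's percent of the vote.

    Returns unrounded (fractional) delegate counts; rounding is left out
    because the rounding rule is not described here.
    """
    leader = max(shares, key=shares.get)
    # A candidate with more than 50% statewide takes every statewide delegate.
    if shares[leader] > 50.0:
        return {c: (float(total) if c == leader else 0.0) for c in shares}
    # Otherwise split proportionally among candidates at or above the threshold.
    qualifying = {c: v for c, v in shares.items() if v >= threshold}
    pool = sum(qualifying.values())
    if pool == 0:
        return {c: 0.0 for c in shares}
    return {c: total * qualifying.get(c, 0.0) / pool for c in shares}


# Illustrative shares only: Cruz just misses the 20% threshold.
near_miss = statewide_delegates({"Trump": 49.9, "Kasich": 30.2, "Cruz": 19.9})
# Illustrative shares only: both Cruz and Kasich clear the threshold.
both_qualify = statewide_delegates({"Trump": 49.9, "Kasich": 27.0, "Cruz": 23.1})
```

In the first case Trump's share works out to a bit under nine delegates, and in the second to about seven, which is the gap discussed above.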

Congressional-District Delegates

Three delegates are allocated on the basis of the results in each of New York’s 27 congressional districts. Here are the rules that are relevant to the current election:

If one candidate receives more than 50% of the vote, they get all three delegates.

If not, the leading candidate gets two delegates, and the second-place candidate gets one.
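Those two rules are simple enough to write down directly. This is a sketch only: the function name is hypothetical, it covers just the two cases listed above, and it does not try to handle ties or any other rule that might apply in an unusual race.

```python
def district_delegates(shares):
    """Allocate one congressional district's three delegates.

    shares maps candidate name to percent of the district vote and must
    contain at least two candidates.
    """
    ranked = sorted(shares, key=shares.get, reverse=True)
    first, second = ranked[0], ranked[1]
    result = {c: 0 for c in shares}
    if shares[first] > 50.0:
        # An outright majority takes all three delegates.
        result[first] = 3
    else:
        # Otherwise the leader gets two and the runner-up gets one.
        result[first] = 2
        result[second] = 1
    return result


# Illustrative district results, not real vote counts.
majority_case = district_delegates({"Trump": 55.0, "Kasich": 30.0, "Cruz": 15.0})
plurality_case = district_delegates({"Trump": 45.0, "Kasich": 35.0, "Cruz": 20.0})
```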

Given Trump’s commanding lead in the polls, it seems likely that he will win most or all of the congressional districts. His delegate haul will depend primarily on the number of districts in which he gets over the 50% mark.

We can guess at this if we assume that the individual congressional districts are Normally distributed around a mean given by the statewide result. What we need, then, is a standard deviation for that distribution.

If we look at earlier Republican primary results, we find that the standard deviation among congressional districts clusters around two values. Most of the states for which results by congressional district are available have standard deviations of around 4, including Missouri, Mississippi, Texas, Tennessee, Oklahoma, Arkansas, and Alabama. But Wisconsin and Georgia both had standard deviations of around 7. My instinct is that New York is more like Wisconsin and Georgia, and the curve shown above uses a standard deviation of 7.

If we use a lower standard deviation, like 4, we get the blue curve in the figure above. The upper part of the curve gets a bit steeper. If Trump gets over 50% statewide, reduced variance among congressional districts means that he falls below 50% in fewer of those districts.

The lower part of the curve is fairly insensitive to the standard deviation. If Trump is below 50% on average, higher variance means that he gets over 50% in more congressional districts. However, it also means that he is more likely to come in second in other congressional districts. On the whole, it’s a wash. The curves above assume that Trump comes in second in any CD where he receives less than 37% of the vote.
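For anyone who wants to play with the numbers, here is a rough Python sketch of the model as described in this section. It is a reconstruction under stated assumptions, not the exact code behind the figure: the function and parameter names are hypothetical, Trump is assumed never to finish third in a district, and below 50% statewide his proportional share is taken to be simply his vote share (that is, Cruz and Kasich both clear 20% and nobody else gets votes).

```python
from statistics import NormalDist


def expected_trump_delegates(statewide, sd=7.0, second_place_cutoff=37.0,
                             n_districts=27):
    """Expected delegate count for a given statewide vote share (percent)."""
    # District shares are modeled as Normal around the statewide share.
    dist = NormalDist(mu=statewide, sigma=sd)
    p_second = dist.cdf(second_place_cutoff)   # below the cutoff: one delegate
    p_majority = 1.0 - dist.cdf(50.0)          # above 50%: three delegates
    p_plurality = 1.0 - p_majority - p_second  # in between: two delegates
    per_district = 3 * p_majority + 2 * p_plurality + 1 * p_second
    # Statewide delegates: winner-take-all above 50%, else proportional
    # (assumption: Trump's proportional share equals his vote share).
    if statewide > 50.0:
        statewide_part = 14.0
    else:
        statewide_part = 14.0 * statewide / 100.0
    return n_districts * per_district + statewide_part


wide = expected_trump_delegates(51.9, sd=7.0)
narrow = expected_trump_delegates(51.9, sd=4.0)
```

With a statewide share of 51.9 and a standard deviation of 7, this sketch comes out at roughly 84 delegates, in line with the prediction above. Dropping the standard deviation to 4 raises the total by a couple of delegates, which is the steeper upper part of the blue curve.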

Prediction for Wisconsin Dem Primary: Sanders by 8%

Going into today’s Democratic primary in Wisconsin, the polls have Sanders with an edge of 2-3 percentage points over Clinton. The obvious expectation would thus be that Sanders will win by about two or three points. Except that the polls have been wrong this primary season. I’m not talking about Michigan, where polling missed the election outcome by 20 points — or not just about Michigan anyway.

Previously, I did a simple linear regression on the primary results to date, comparing the final polling averages and projections to the actual outcome. I’ve updated that analysis with the more recent results, although the result is more or less the same:

[Figure: Dem Primary Regression 4-4-16]
Regression of final projections from 538’s Polls Plus model versus election outcomes in the Democratic primaries. The red square is the prediction for today’s Wisconsin primary, where the final polling numbers have Sanders up by about three points.

When we compare the final polling averages to election outcomes, we find two things. First, the regression line has a slope greater than one: in the states where Clinton was expected to win big, she mostly outperformed expectations; in states where Sanders was favored, he tended to overperform. This sort of effect might come from a variety of places. For example, it could be that late-deciding voters tend to go with the candidate they think will win, or it could be that voters supporting a candidate very likely to lose become demoralized and stay home. Or, probably, lots of other things.

Second, there is an offset of about 5 points. That is, in close races, Sanders does about five points better on average than the final polling numbers would predict. I am fairly certain that this offset is due to a mismatch between the voter turnout models used by pollsters and the reality of this election. Specifically, this election seems to have significantly higher turnout among younger voters compared with previous years. And those younger voters heavily favor Sanders. So, when a pollster constructs their final numbers using a model based on 2012 turnout, they underestimate the number of young Sanders supporters.

The final projection from 538’s Polls Plus model has Sanders winning 50.2% of the vote in Wisconsin to Clinton’s 47.2%. Plugging that into the regression formula gives a Sanders margin of victory of 8.3 points.

In the Democratic primaries, delegate apportionment tends to track the raw vote totals pretty closely. So this would project Sanders to win about 46 of the state’s 86 pledged delegates.
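The conversion from margin to delegates is just arithmetic, sketched here in Python. The function name is hypothetical, and the sketch assumes a purely proportional split between two candidates whose shares sum to 100%, which ignores the district-level allocation and any votes for other candidates.

```python
def delegates_from_margin(margin_pct, pledged=86):
    """Winner's delegates under a purely proportional two-candidate split."""
    # A margin of m points in a two-way race means the winner has (100 + m) / 2 percent.
    winner_share = (100.0 + margin_pct) / 200.0
    return winner_share * pledged


sanders_delegates = delegates_from_margin(8.3)
```

For a margin of around eight points, this lands between 46 and 47 of the 86 pledged delegates.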

In order to go into the national convention with half of the pledged delegates, Sanders needs to win a little over 56% of the remaining delegates. An eight-point win today would give him about 54% of today’s delegates, which would leave the race in pretty much the same state it has been in for a while: Sanders has a shot at overtaking Clinton, but you would not get anything close to even odds on it.

One methodological note: the regression presented here does not include any of the results from the last two rounds of primaries and caucuses — most of which Sanders won by large margins — because there was little to no polling data for those states in the run-up to their elections.

One final note: the conventional what-passes-in-political-punditry-as-wisdom is that Sanders does best in caucus states, while Clinton does better in primaries. That may be true, but all of the points included here are from primary states, with the exception of Nevada. (If we exclude Nevada, the projection changes only slightly: to Sanders by 8.8%.) So, even if there is a primary versus caucus effect, the fact that caucus states received so little polling means that it does not have much effect on this analysis.