Based on current polls, Princeton professor Sam Wang projects Donald Trump Will Win the Republican Nomination with a median of 1,356 delegates, 119 more than he needs at the convention.
Wang tweeted ...
How did Wang arrive at such an incredible prediction?
Invalid Use of National Surveys
According to Wang, "A central problem is therefore how to construct the natural variation in state-to-state support. This can be done using national surveys, based on the fact that the national average contains respondents drawn from across the U.S."
Using national surveys, Wang concluded Trump had a "92% of the probability of winning 1,237 delegates or greater", but after accounting for "Cruz bias" he reduced that figure to "70%, about 2-1 in Trump's favor".
As logic would dictate, and as year-to-date results show, basing state projections on national polls is a very poor statistical idea.
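For readers who want to check the "about 2-1" phrasing, here is a minimal sketch of the standard probability-to-odds conversion. The function name `odds_in_favor` is my own for illustration; it is not something from Wang's model, and it says nothing about how he arrived at 92% or 70% in the first place.

```python
def odds_in_favor(p):
    """Convert a win probability p (0 < p < 1) into odds in favor, p : (1 - p)."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return p / (1 - p)


# The two probabilities quoted from Wang's post.
for p in (0.92, 0.70):
    print(f"{p:.0%} -> about {odds_in_favor(p):.1f}-1 in favor")
```

By this arithmetic, 70% works out to roughly 2.3-1, so "about 2-1" is a fair rounding, while the unadjusted 92% would be roughly 11.5-1.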
What does strength in New York or Maryland have to say about strength in other totally unrelated states like Indiana or Colorado?
In a comment on his blog, I very politely pointed out the flaws of using national polls to predict state outcomes. I referenced a recent post of mine and another by the New York Times.
He immediately removed my comment. I tried a second time and he removed that.
I emailed Wang and asked him if he even bothered to read my analysis or that of the New York Times.
He did not respond. Instead, being the "open-minded" professor that one might expect in the US education system, he actually banned me from even reading his tweets!
Banning someone from reading your tweets is certainly a low-class act.
However, Wang's blog still remains open so I present this snapshot.
I bounced Wang's post off Salil Mehta at Statistical Ideas.
Mehta responded ....
I found it interesting that this individual is taking some thoughtful measures of previous empirical distributions, but then relying heavily on applying "standard deviations" to future election outcomes. Natural election outcomes (especially a composite of these late-stage primary states) tend not to share symmetrical bell-shaped properties.
Another issue is in the treatment of the 1.2 Cruz handicap. Earlier in the primaries when there were many more candidates and Cruz was a severe underdog, it is far easier to generate a high-ratio handicap. (See my article Getting Trumped).
Now with just 3 candidates, and Cruz with a much larger portion of the preference, it is clearly more difficult to get the same high-ratio handicap. The author states that since Cruz this season has been at a certain handicap, it is "fair" to merely apply this same bonus to future projections. This assumption appears to be a significant problem, but one the author can rectify in future modeling!
Three Wang Errors
Given Mehta's analysis, it's pretty safe to conclude Wang made at least three major errors (one I pointed out and two by Mehta).
If Wang wishes to do so, I will gladly post his state-by-state projections to see where he gets his numbers from.
I projected a Trump win, but barely, citing three key states: New York, Indiana, and California. I beat the New York Times to that analysis.
Nate Silver Update
The Business Insider reports NATE SILVER: Donald Trump's Chances of Locking up the Nomination Appear to be Dwindling.
That headline is based on Silver's State by State Roadmap from which the following chart was derived.
Nate Silver vs. Mish
I appear to be just shy of the mark. But I gave Trump 15 of 125 uncommitted delegates. He may not win many, but he won't lose them all. With those 15 delegates, Trump ekes out a win on the first ballot.
I am down from my original total as Trump did not pick up the seven I assigned from Colorado.
Pennsylvania could easily provide a decent surprise if Trump needs another dozen. After that, Trump has a huge problem, but so does the GOP if Trump blows out Cruz in the majority of states but does not win the nomination.
Indiana is the key unknown. Silver originally projected 37. Silver now has 9 as the "deterministic" single most likely number, in a "probabilistic" range that widens to 22.
On what basis did Silver make this adjustment?
Last month, our panel gave Trump an average of 37 delegates in Indiana, which implied that he's the favorite to win there. I don't think I can agree with that after Wisconsin, however. The states are relatively similar demographically. In Indiana, as in Wisconsin, Trump doesn't have much support from statewide elected officials. Moreover, the Midwest as a whole has been a middling region for Trump. Earlier in the calendar, he got away with some wins in the Midwest with a vote share in the mid-to-high 30s. Now that the field has winnowed and Republican voters have learned to vote tactically, he'll often need 40 percent of the vote to win a state instead.
Still, some caution is in order. There's been no polling at all in Indiana. Perhaps Trump can hope that his momentum from April 26 will carry him to victory, or that Kasich will drain a few votes from Cruz in counties that border Ohio. My deterministic projection has Trump losing -- although salvaging a few congressional districts -- while the probabilistic one is more equivocal. The path-to-1,237 projection has Trump winning, almost out of necessity, because it will be hard for him to carve out a path to 1,237 delegates without the Hoosier State. Revised Trump delegate projections: deterministic 9; probabilistic 22; path-to-1,237 48.
There are no polls. Silver is guessing and so am I.
If Trump loses Indiana and that big block of 30 statewide delegates (the remaining 27 are at the district level), it will be very difficult for Trump to come up with 1,237.
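To make the Indiana arithmetic concrete, here is a rough sketch that enumerates which outcomes could produce a given delegate total. It uses the 30 statewide and 27 district-level delegates cited above, and it assumes (my assumption, not something Silver spells out in the quoted passage) that the district delegates split evenly across Indiana's 9 congressional districts at 3 apiece, with each district and the statewide bloc going entirely to its winner. The function names are hypothetical.

```python
STATEWIDE = 30        # statewide bloc, figure from the post
DISTRICT_TOTAL = 27   # district-level delegates, figure from the post
DISTRICTS = 9         # Indiana's congressional districts
PER_DISTRICT = DISTRICT_TOTAL // DISTRICTS  # 3 each, assuming an even split


def indiana_delegates(wins_statewide, districts_won):
    """Delegates won under the winner-take-all assumptions stated above."""
    if not 0 <= districts_won <= DISTRICTS:
        raise ValueError("districts_won must be between 0 and 9")
    statewide = STATEWIDE if wins_statewide else 0
    return statewide + PER_DISTRICT * districts_won


def scenarios_for(total):
    """All (wins_statewide, districts_won) pairs that yield exactly `total`."""
    return [
        (wins, d)
        for wins in (False, True)
        for d in range(DISTRICTS + 1)
        if indiana_delegates(wins, d) == total
    ]


# Silver's three revised Indiana numbers, as quoted above.
for label, total in (("deterministic", 9), ("probabilistic", 22), ("path-to-1,237", 48)):
    print(label, total, scenarios_for(total))
```

Under those assumptions, 9 delegates can only mean losing statewide while winning three districts, and 48 can only mean winning statewide plus six districts. No single outcome produces 22, which fits Silver's description of the probabilistic number as something more equivocal than one concrete result.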
Given the vital importance of Indiana, one would think pollsters would be all over the state. But here we are.
Trump will win or die on the results of California and Indiana.
The quote of the day goes to Salil Mehta: "The only real forecast that is >99% accurate is the one that is made the day after the election."