Accurate Election Forecast

Post-Election: Nailed It, Initial Review of Forecasts

Our final forecast was for Trump to win the election with 306 electoral votes. Though not every state has been officially declared, it looks like Trump will win 306 electoral votes. This election has been an unprecedented success for ‘newer’ data and analytics and an awful one for those relying on traditional methods and consensus views. It is a time of reckoning for many pundits, forecasters, and analysts.

As described in the previous post, the day prior to the election every major political forecaster, analyst, and betting market projected a relatively easy Clinton victory, with many putting her probability of winning above 90%. In this environment, it was extremely difficult to forecast a Trump victory, let alone a resounding electoral win. Having been ridiculed for making such an ‘out-of-consensus’ call, I cannot deny the feeling of vindication today.

There are many lessons to be learned from the 2016 election. From an analytical point of view, it boils down to understanding the underlying question or problem, building your analytical framework around the current (not historical) environment, remaining fact-based by leveraging the most relevant datasets, and not being swayed by herd mentality. Unfortunately, it seems like the most established forecasters and pundits failed to varying degrees on each of these elements.

A complete critique of the election results and of our own and others’ analytics is not yet possible, as data is still trickling in from a variety of states. But it looks like things shaped up along the lines that we had advised and warned about over the past year. Principally:

Social Media and On-line Activity Data – ignore these at your own risk. People show you what they are thinking and what interests them by what they do on social media and on-line. These datasets are large, fast, and unbiased. They called the primaries before voting began and the general election months in advance. Why people have ignored these obvious sources of data is mind-boggling.
Rustbelt / Brexit State ‘Surprises’ – by highlighting surges near the election in pro-Trump social media, searches, and on-line activity in these states, which overlapped with relatively pro-Trump demographics, we pointed to states such as WI, MI, OH, PA, NC, NH, and IA as ones that would surprise in favor of Trump, even though many of the then-current polls did not look as promising.
Spike in Turnout – we pounded the table repeatedly for people to take note and adjust their forecasting models. Though data is still trickling in, it looks like many turnout records will be broken this election. Forecasters who did not incorporate such a scenario, and instead stuck with historical turnout models to forecast the 2016 election, were simply wrong.
Whites without College Degrees – shown in multiple posts to be a linchpin for this election, they were mostly ignored in forecasting models, even though they were repeatedly shown to have exceptionally high enthusiasm both for Trump and for voting. In a specific post, we even modeled how this single demographic could swing the general election. To my knowledge, only one other forecaster did any scenario-based analysis on turnout for this demographic, which of course ended up being key to the entire election.
Obama’s Coalition Falls Apart – in many posts using hard poll data, we reviewed how Obama’s Coalition would not turn out or vote to the same degree for Clinton, focusing specifically on the African-American, women, and youth cohorts. The fact that forecasters used results based on previous elections rather than the current environment is unconscionable.
Interest and Enthusiasm Levels – on-line activity, polls, rally attendance, record-breaking primary voting, and debate viewership all pointed to incredibly high interest in this election, and that interest leaned strongly in favor of Republicans. The fact that this data was not leveraged more for forecasting is astounding, especially in a race that pointed towards high turnout.
Social Desirability Bias – many analytical and political professionals dismissed the notion of a ‘hidden Trump supporter’ as a figment of the imagination. In our posts, it was repeatedly quantified using many different US political datasets (Congress approval, Obama’s approval rating, Clinton vs Trump, Republican primaries, pro-Clinton bias, and high undecideds) and shown to be an academic topic worthy of discussion, not a frivolous one used for political gain. By completely disregarding it, forecasters failed to incorporate all available (and provable) data into their models, frankly an amateur mistake.
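
To make the idea of scenario-based turnout analysis concrete, here is a minimal sketch of what such a model can look like. It is not the model used in our posts: the `Group` class, the helper functions, the two-group split, and every number in `BASELINE` are hypothetical and invented purely for illustration. It holds everything constant except the turnout rate of one demographic and recomputes the two-way vote margin for each scenario.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Group:
    name: str
    eligible: int    # eligible voters (made-up figure)
    turnout: float   # fraction of eligible voters who actually vote
    share_a: float   # fraction of the group's two-way vote going to candidate A


def margin(groups):
    """Candidate A's vote margin over candidate B in a two-way race."""
    # Each voter in a group contributes share_a - (1 - share_a) = 2*share_a - 1.
    return sum(g.eligible * g.turnout * (2 * g.share_a - 1) for g in groups)


def turnout_scenarios(groups, target, turnouts):
    """Recompute the margin while varying turnout for one group only."""
    results = {}
    for t in turnouts:
        scenario = [replace(g, turnout=t) if g.name == target else g
                    for g in groups]
        results[t] = margin(scenario)
    return results


# Hypothetical state: every number below is invented for illustration.
BASELINE = [
    Group("whites without college degrees", 1_000_000, 0.55, 0.65),
    Group("all other voters", 1_500_000, 0.60, 0.40),
]

if __name__ == "__main__":
    scenarios = turnout_scenarios(
        BASELINE, "whites without college degrees",
        [0.50, 0.55, 0.60, 0.65, 0.70],
    )
    for t, m in scenarios.items():
        print(f"turnout {t:.0%}: candidate A margin {m:+,.0f}")
```

With these invented inputs, candidate A trails at the baseline turnout of 55% and leads once that one group's turnout reaches 65%, with nothing else changed. A real analysis would use actual state-level demographics and vote shares, and would vary more than one assumption at a time.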

The 2016 election will certainly go down in history as one that upended many preconceived notions of political analysis and forecasting. However, the data needed to correctly forecast this election was available and (not to be immodest) these insights were repeatedly hammered home in posts and research on this site. The expected refrain of ‘impossible to call’ is frankly unacceptable; the poor forecasts of others were completely avoidable.