Election Forecasts


The final Coogan forecast is based on a variety of data and techniques summarized in this post and explained further in other posts.  These forecasts point towards a Trump election win and are considerably out-of-consensus with the current traditional pundit and analyst forecasts.  They will be updated until election-day to reflect changes in underlying data and assumptions.  A summary of the candidate popular vote forecasts is as follows:

 

Table 1:  Forecasts for Popular Vote and Margin of Victory for 2016 US Presidential Election

                    (Market)  (Coogan)       (Coogan)           (Coogan)      (Coogan)       (Coogan)
Candidate           Polls     Post 2nd Term  Identity Politics  Social Media  Search Trends  Social Desirability Bias
Clinton             45.2%     44.1%          44.0%              48.0%         46.0%          44.5%
Trump               42.3%     45.9%          46.0%              50.3%         50.0%          48.5%
Johnson             9.5%      8.0%           8.0%               1.1%          3.2%           5.0%
Stein               3.1%      2.0%           2.0%               0.6%          0.8%           2.0%
Margin of Victory   2.9%      -1.8%          -2.0%              -2.4%         -4.0%          -4.0%

Source:  RealClearPolitics, Wikipedia, Google Trends, ZettaCap

 

The Polls data shows a win for Hillary Clinton which translates into a slightly smaller Margin of Victory at 2.9% as compared to Obama’s 3.9% win over Romney in 2012.  In other words, Clinton is projected to more or less replicate Obama’s last victory.  The Polls data is a simple average of the national polls over the last month from the poll consolidator RealClearPolitics (“RCP”).  This data has been corrected to take out undecideds by reallocating them on a pro-rata basis to the various candidates based on existing poll levels.  (For those of you who prefer the RCP average as it is often quoted in the news, it amounts to 3.2% in favor of Clinton at the writing of this post.)  In short, the Polls column shows the current collective knowledge of the national polls and has consistently pointed towards a Clinton victory barring very brief dips into Trump territory.
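The pro-rata reallocation of undecideds described above amounts to rescaling the decided shares so that they sum to 100%.  A minimal sketch follows, assuming the adjustment is a simple proportional rescaling.  The raw poll numbers and the function name are hypothetical and are used only for illustration; they are not the RCP figures behind the table.

```python
def reallocate_undecided(raw_shares):
    """Spread the undecided share across candidates in proportion
    to their existing poll levels (shares are in percent)."""
    decided_total = sum(raw_shares.values())
    if decided_total <= 0:
        raise ValueError("at least one candidate needs a positive share")
    # Dividing by the decided total is the same as handing each candidate
    # a slice of the undecideds equal to their share of decided voters.
    return {name: 100.0 * share / decided_total
            for name, share in raw_shares.items()}


# Hypothetical raw poll averages; the missing 10 points are undecided.
raw = {"Clinton": 42.0, "Trump": 39.0, "Johnson": 6.0, "Stein": 3.0}
adjusted = reallocate_undecided(raw)

for name, share in adjusted.items():
    print(f"{name}: {share:.1f}%")
```

Note that this kind of rescaling leaves the ratio between any two candidates unchanged, which is exactly the assumption that the Social Desirability Bias section later calls into question.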

On the other end of the spectrum, we have the Coogan projections shown in the table which point towards a Trump victory.  These are extremely out-of-consensus forecasts as the collective assumption among neutral analysts is that Trump has an extremely slim chance of winning or even of avoiding a blowout.  Most odds-makers and quants place the probability of a Clinton win at 80% to 90%.

As explained in a series of posts found on this site, polls appear to be heavily biased in 2016 making them questionable as the only inputs into forecasting models.  The Coogan forecasts instead rely on newer data such as social media and on-line activity, on correcting for an assumed Social Desirability Bias, on forecasting demographic group voting, and on historical election analysis.  These topics are very large and not capable of being fully discussed here, but a summary is provided.  They are presented in the order of the net Margin of Victory and not by any preference.

Post Second Term

Successor candidates of a two-term president normally do not perform as well as the two-term president at the ballot box.  This rule-of-thumb has been confirmed to work after looking at almost two centuries of US election data.  It also passes the common sense test, which frankly is important as well.

The idea is that a political party always puts forth its strongest candidate first and then nominates an adequate but likely not as strong candidate as the president’s successor.  At the same time, much of the populace, after eight years with the two-term president, is open to a new message, which usually means a message from a different party.  Basically, a two-term president’s successor can get elected, but only if the predecessor was extremely popular, as the successor will likely receive fewer votes than the predecessor.

In all of the historical cases from the US, only one shows a successor candidate performing better at the ballot box than the two-term predecessor.  This was the case of Teddy Roosevelt being elected after President McKinley’s second term.  However, this was no usual case as McKinley died in office soon after his second inauguration, giving Roosevelt more than three years as president before having to run for president his first time.  So, in reality, Roosevelt was likely viewed by the public as an incumbent during this election.  In short, the case of Roosevelt is the exception to the rule but it also appears to be very much different as well.

In every other case outside of Roosevelt, the successor candidate of a two-term president performed worse.  The average of all the cases was a 12 percentage point decline in the popular vote margin of victory.  Using just the last three examples (Reagan to Bush, Clinton to Gore, and Bush to McCain), the average decline amounted to 9.4 percentage points.  In other words, if the two-term president did not win their re-election by a fairly good margin (by around 10 percentage points or more) then it is not very likely that a successor candidate will win the follow-on election.  The lower the margin of victory was in the predecessor’s re-election, the worse the chances for the successor candidate.

In 2016, Clinton is the successor candidate for Obama.  Obama won his re-election by 3.9 percentage points, which is a historically tight margin.  Given historical precedent, Clinton will likely not perform as well as Obama.  If Clinton outperforms the norm, she will still lose a few percentage points which means in a best case scenario, using this analysis technique, the election will be incredibly close and within the margin of error of the polls.

For the projections in the table, a discount to the average negative swing of 9.4 percentage points was used.  In a normal election a discount would not be applied, but Trump is acting very much like an independent candidate and breaking with much Republican precedent.  In this case, the portion of the electorate that swings from the in-power party (Democrats) to the out-of-power party (Republicans) will likely be smaller than if Trump had toed the Republican line.  Regardless, a negative swing against the in-power party should occur as it has in almost all historical US presidential elections.  The degree of that swing should depend on how risky the electorate views Trump in comparison to a plain Republican candidate.
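The size of the discount is not stated in this post, but it can be backed out from the numbers that are given.  The sketch below assumes the discount was applied as a simple proportional reduction of the 9.4 point average swing.  That functional form and the function name are assumptions made for illustration, not a description of the actual model.

```python
# Figures taken from the text and from Table 1.
OBAMA_2012_MARGIN = 3.9     # Obama's re-election margin, in points
AVG_RECENT_SWING = 9.4      # average decline over the last three cases
TABLE_MARGIN = -1.8         # Post 2nd Term column, Clinton minus Trump


def project_margin(predecessor_margin, average_swing, discount):
    """Successor margin after applying a discounted negative swing.

    discount=0.0 applies the full historical swing; discount=1.0
    applies none of it (assumed proportional form)."""
    return predecessor_margin - average_swing * (1.0 - discount)


# Swing actually embedded in the table, and the discount it implies.
implied_swing = OBAMA_2012_MARGIN - TABLE_MARGIN
implied_discount = 1.0 - implied_swing / AVG_RECENT_SWING

print(f"implied swing:    {implied_swing:.1f} points")
print(f"implied discount: {implied_discount:.0%}")
print(f"no discount:      {project_margin(3.9, 9.4, 0.0):+.1f} points")
```

Under this assumption the table embeds a swing of 5.7 points, or a discount of roughly 39% to the historical average.  With no discount at all, the same arithmetic would put Clinton's margin at -5.5 points.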

Identity Politics

With heavy emphasis placed on Obama’s Coalition of 2008 and 2012, demographic group forecasts were made for 2016.  These forecasts relied on historical voting and voter turnout trends mixed with a variety of current insights such as enthusiasm / interest levels regarding the 2016 election and confirmed poll biases.

The primary conclusion is that the Coalition will support Clinton but not to the same extent it supported Obama.  In fact, the decline should be significant.  Reading the news and even taking a cursory look at demographic poll data shows the support of minority groups and of the youth demographic falling for Clinton.  The African-American community, the Democrats’ most loyal group by far, will likely post a substantial drop in turnout and in percentage voting for Democrats as compared to 2008.  Hispanics, contrary to most headlines, do not appear to be supporting Clinton any more than they did Obama and in a somewhat surprising turn appear to support Trump to a greater extent than they did Romney in 2012.  The youth demographic shows distinct signs of dissatisfaction with Clinton and appears to instead be leaning towards third party candidates.

Women appear to hold the wild-card for Clinton.  If women can make up for the aforementioned losses, Clinton could in fact pull off a victory.  However, a variety of surveys show that the enthusiasm / interest level of women in this race is lacking.  Normally, such lower relative levels of interest in the current election translate to lower turnout, not higher turnout which appears to be the base case assumption from many.

In contrast, Trump benefits from demographic group analysis.  Like Obama in 2008, he has attracted a demographic that historically has a lower voter turnout rate, namely whites without college degrees.  In 2008, many analysts questioned the notion of including forecasts of increased turnout for demographic groups favoring Obama as there was little historical precedent.  However, many of these groups hit or approached generational highs for voter turnout in 2008 / 2012.  Similarly, Trump’s whites without college degrees have historically low voter turnout but are expressing intense enthusiasm and interest in the current election.  Higher turnout from this demographic greatly improves Trump’s chances of victory.

If you are wondering how polls can be so far off from this ‘Identity Politics’ analysis, it is because polls almost universally use turnout assumptions based on previous elections.  So, they re-weight answers from data collection by using backward-looking turnout data from previous elections.  If this analysis of 2016 ‘Identity Politics’ is correct, the weights for Obama’s Coalition are too high and the weights for Trump’s whites with no college degrees are too low.  Just by changing some basic assumptions as to 2016 turnout, Clinton’s win turns into a Trump win.

Assuming turnout shifts considerably, which appears to be the case, election forecasting will change substantially from a static approach based off of historical trends to a scenario-based one built around sensitivity analysis given a variety of turnout assumptions.
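To illustrate how a turnout assumption alone can flip a projected winner, here is a minimal sketch of a scenario-based calculation.  Every number in it (group sizes, turnout rates, support levels) is hypothetical and chosen only to show the mechanics, and the group and function names are likewise made up for this example.  None of it is the data behind the Identity Politics column.

```python
def national_margin(groups):
    """Clinton-minus-Trump margin, in points, among ballots cast."""
    ballots = {name: g["eligible_share"] * g["turnout"]
               for name, g in groups.items()}
    total = sum(ballots.values())
    clinton = sum(ballots[n] * groups[n]["clinton"] for n in groups)
    trump = sum(ballots[n] * groups[n]["trump"] for n in groups)
    return 100.0 * (clinton - trump) / total


def with_turnout(groups, overrides):
    """Copy of `groups` with some turnout rates replaced."""
    return {name: {**g, "turnout": overrides.get(name, g["turnout"])}
            for name, g in groups.items()}


# HYPOTHETICAL inputs: share of eligible voters, turnout rate, and
# each candidate's support within the group.
baseline = {
    "obama_coalition":   {"eligible_share": 0.40, "turnout": 0.60,
                          "clinton": 0.70, "trump": 0.25},
    "whites_no_college": {"eligible_share": 0.35, "turnout": 0.55,
                          "clinton": 0.30, "trump": 0.65},
    "other":             {"eligible_share": 0.25, "turnout": 0.65,
                          "clinton": 0.45, "trump": 0.50},
}

# Scenario: Coalition turnout falls, non-college white turnout rises.
shifted = with_turnout(baseline, {"obama_coalition": 0.50,
                                  "whites_no_college": 0.68})

baseline_margin = national_margin(baseline)
shifted_margin = national_margin(shifted)
print(f"backward-looking turnout: {baseline_margin:+.1f} points")
print(f"shifted turnout:          {shifted_margin:+.1f} points")
```

Support within each group is held fixed in both runs, so the change in sign comes entirely from who shows up.  A fuller sensitivity analysis would loop over a grid of turnout overrides rather than a single scenario.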

Social Media

The manner in which people communicate has changed dramatically over the last decade as social media has increased in importance.  This shift has not, however, been matched by the political analysis community, which prefers to shun these newer datasets in favor of traditional polls.  This should not be that significant of an issue as conclusions from polls and social media should, in theory, replicate one another.  However, it seems that in this emotional election cycle, many people are giving poll responses that differ from their true opinions due to a presumed desire to not produce social upset (Social Desirability Bias explained later and in other posts).

Social media has the advantage of being much less filtered and structured than polls.  Many people interact on social media in a much freer way than they would when discussing a topic with a live interviewer over the phone.  By analyzing social media, an analyst can get around much of the presumed social biases involved in emotional elections and can also reach a larger demographic.

The social media analysis presented here is based on work done by ZettaCap, which focuses on leveraging social media and on-line activity for financial market analysis.  Concepts of social media influence have been adapted from financial market analysis to election analysis to create the Social Media Influence index (“SMI”).

The SMI gave strong indications from August 2015 that Trump would win the Republican nomination and that Sanders would compete strongly with Clinton for the Democratic nomination.  These calls remained consistent throughout the nomination process with very limited changes.  For brief periods, Carson’s SMI surpassed that of Trump, as did Sanders’ SMI that of Clinton, but for the most part Trump dominated the Republican pack and Clinton held on to a narrow lead over Sanders.

In the primaries, Trump surprised most pundits by dominating the process as Sanders surprised most by giving Clinton a tremendously difficult fight.  The SMI made these calls well ahead of pundits and even before the beginning of voting, making the SMI the only known quantitative approach to have correctly called the primaries before voting began.

As for the general election, these same concepts have been directed towards the four main candidates.  Trump has a distinct advantage as you might expect given the medium.  However, Clinton has been closing the gap recently and frankly the state of how social media could impact the race is still open.  Many well-known surrogates have become much more vocal in supporting Clinton over the last few weeks and this trend will likely continue until election-day.  Therefore, the current rather wide advantage for Trump could lessen considerably, but his SMI lead should remain intact if just barely.

Another interesting point unveiled by social media analysis is that the third party candidates are really not doing that well, especially in comparison to their poll levels.  Both Johnson and Stein, but especially Johnson, have been performing quite poorly on social media.  This is somewhat surprising given the demographics of those who normally prefer these third party candidates.  The youth demographic clearly is the strongest demographic for both parties – and this same demographic is more apt to use social media, giving these parties a natural advantage when using this metric.  However, the SMI shows that both the Libertarian and Green candidates score well below their poll levels.

Assuming that the actual election results show Trump outperforming polls and the third party candidates underperforming polls, social media analysis for election forecasting should receive a significant boost and may even partially displace polling as a primary source of data.

On-Line Activity

People get their information on-line making such activity paramount for political and election analysis.  In other words, if you want to know which candidate, policies or parties resonate better with the public, analyzing their on-line activity (in addition to social media) provides incredible insights.

In the US, the most visited search engine is Google and the most visited information site is Wikipedia (higher than any news site).  By looking at Google search trends and Wikipedia page views, we can fairly accurately determine which candidate, party, and even policies are generating the most interest.  Judging from historical election analysis, interest and enthusiasm are tightly linked with voter turnout and voting inclination.  In other words, the candidate generating the most interest has a significant advantage.  In our modern on-line world, this means that the candidate with the most on-line interest holds a significant advantage.

In 2016, Trump is generating far more interest than Clinton in both search and page views.  This interest is also very consistent as Trump’s lead has lasted during the vast majority of the election season.  Clinton has on occasion beaten Trump on a variety of on-line activity metrics, but such occurrences were mostly due to specific events such as the DNC and Clinton’s September health scare.  In general, people simply appear to be more interested during this election in Trump and the Republican Party.

These trends are well established by looking at a variety of Google and Wikipedia metrics.  Interestingly, they are confirmed by news coverage on the candidates (as measured by news volume), by social media activity, and by poll questions regarding level of interest or enthusiasm in the election.  All of these variables show that Trump edges Clinton out, sometimes by a very large degree.  As with other forms of analysis, this conclusion becomes stronger when confirmed by multiple data sources – which is what we see in this election.

One of the main weaknesses with using on-line activity for election analysis is that there is limited data from other election cycles.  Google search trends, however, do go back to the 2004 election which does provide a longer term perspective for on-line activity.  With this data we can at least test some of the theories against actual results.

In the three previous US Presidential Elections (2004, 2008, and 2012), Google search trends did an excellent job of forecasting the winner, the margin of victory and relative turnout levels.  If we assume that a higher volume of Google searches equates to higher interest levels, we can also assume that the candidate receiving more searches will likely win the election and that higher aggregate searches for both candidates should indicate overall voter turnout levels.  In the historical examples, the candidate that led in searches also won the election.  The size of the margin of victory in popular vote also appears to have a strong relationship with the relative search lead – meaning a narrow lead in searches appears to predict a narrow popular vote victory whereas a wide search lead appears to predict a wider popular vote margin of victory.

In 2016, Trump’s dominance in search indicates that he should win the election.  Comparing his relative lead in search to previous elections puts this current race as more competitive than 2008 but less so than 2012.  In other words, using Google search to forecast the election results, Trump should win by a margin of popular vote somewhere between Obama’s two margins of victory.

The popular vote estimates for the third party candidates were calculated by using a simple search volume relative analysis versus the main candidates.  It is interesting to note that third party candidates score poorly using search metrics, or at least much lower in comparison to current polls.  This trend is also reflected in other metrics such as social media.

Search and other on-line metrics show very similar results to social media analysis.  Trump is expected to well outperform current polls whereas third party candidates are expected to well under-perform.

Social Desirability Bias

The dynamic that explains why an individual would purposefully answer a survey or poll question incorrectly due to perceived or real social pressure is Social Desirability Bias.  It is a well-known phenomenon studied by academics in a variety of areas including surveys on personal finances, private sexual practices, health habits and other areas where people might feel social pressure to change their responses on surveys.  In politics, it is often referred to as the Bradley Effect.

In 2016, Social Desirability Bias has emerged as the leading explanation of why live polls and anonymous polls could vary to such a degree and why polls and on-line activity could point to such different conclusions.  In short, Trump has been successfully branded as a socially unacceptable candidate and many individuals have modified their poll answers in order to avoid social upheaval.

Social Desirability Bias has impacted the race in a number of ways.  The most obvious impact is that Clinton’s poll numbers are artificially higher and Trump’s numbers are artificially lower than their actual support levels.  Third party candidates also have benefited as their poll numbers have been artificially inflated by individuals who ‘hide’ their true voting intentions by declaring support for these somewhat neutral and socially inconsequential candidates.  Also, the level of undecideds is unusually high for this period of the election cycle, implying that many others are simply declaring undecided to avoid social problems.

Forecasting using Social Desirability Bias assumptions is tricky at best.  We can more or less determine the current levels of bias by comparing live and anonymous polls.  Then we can look at historical elections that presumably also had relatively high levels of (unconfirmed) bias to use as sample cases for forecasting.  These historical examples show that in the run-up to the election, votes tend to move towards the least socially acceptable candidate and tend to move away from smaller third party candidates.  Also, the bulk of undecideds tend to end up with the least socially desirable candidate.

Applying these data and assumptions to 2016, the results differ significantly from the polls.  Third party candidates lose a considerable number of votes, as reflected by their relatively weak anonymous polls in comparison to live polls in addition to their weak social media and on-line activity metrics.  Clinton’s support is forecast to remain more or less stable from current poll levels.  Trump is forecast to pick up almost all of the undecideds and those leaving the third party candidates.

The forecast based on Social Desirability Bias is heavily weighted in Trump’s favor, much more so than even the most aggressive social media and on-line activity based forecasts.  However, judging from previous elections where bias is presumed to have existed, similar late surges occurred for the socially less desirable candidate.  In these cases, undecideds also tended to disproportionately move towards that same candidate.

The forecast results in 2016 might appear too different from the current polls to take seriously, but most of this is due to the unusually high level of undecideds.  Most or even all other election forecasters appear to assume that the undecideds will be allocated to the various candidates on a pro-rata basis based on current poll levels.  However, when conducting a forecast based on Social Desirability Bias, you assume that the undecided category is so large because it contains many individuals ‘hiding’ their actual support for the most socially undesirable candidate in the ‘safe’ and socially acceptable undecided category, at least until election-day.
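The difference between the two allocation assumptions can be made concrete with a short sketch.  The raw poll levels, the size of the undecided pool, and the 90/10 split below are all hypothetical and serve only to contrast the two approaches.  The function name is also made up, and this is not the actual model behind the Social Desirability Bias column.

```python
def allocate_undecided(decided, undecided, weights=None):
    """Add the undecided share to each candidate's decided share.

    With weights=None the split is pro-rata to current poll levels;
    otherwise `weights` gives each candidate's fraction of the
    undecided pool and must sum to 1."""
    if weights is None:
        total = sum(decided.values())
        weights = {name: share / total for name, share in decided.items()}
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return {name: share + undecided * weights.get(name, 0.0)
            for name, share in decided.items()}


# HYPOTHETICAL raw poll levels, in percent, with 7 points undecided.
decided = {"Clinton": 44.0, "Trump": 41.0, "Johnson": 6.0, "Stein": 2.0}
undecided = 7.0

# Conventional assumption: undecideds break like everyone else.
pro_rata = allocate_undecided(decided, undecided)

# Bias assumption (illustrative split): most undecideds are hidden
# supporters of the socially less acceptable candidate.
sdb = allocate_undecided(decided, undecided,
                         {"Trump": 0.9, "Clinton": 0.1})

pro_rata_margin = pro_rata["Clinton"] - pro_rata["Trump"]
sdb_margin = sdb["Clinton"] - sdb["Trump"]
print(f"pro-rata margin:        {pro_rata_margin:+.1f} points")
print(f"bias-adjusted margin:   {sdb_margin:+.1f} points")
```

With these made-up inputs, the same poll produces a Clinton lead of about 3.2 points under the pro-rata assumption and a Trump lead of 2.6 points under the bias assumption.  The sketch only moves undecideds; a fuller version would also shift some third party support, as described above.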

Assuming Trump wins, Social Desirability Bias, in addition to social media and on-line activity analyses, will help to re-write many of the election analysis rules.