Fundamental Forecasts Needed

Election 2016 | Problems with Current Election Analysis: Overreliance on Polls and Pundits, a Dearth of Actual Forecasts, and the Need for Better Forecasting Models

Current election analysis is essentially a two-pronged approach: pundits control the qualitative analysis, and polls provide the lifeblood of analysts’ quantitative analysis. Unfortunately, both approaches have glaring weaknesses and limitations. Election analysis needs to become less reliant on polls and pundits and focus more on ground-up forecasting of ‘election drivers’ (and, of course, ‘newer data’ such as social media and online activity, which will be covered elsewhere).

Pundits are impressively biased, though many try to hide or disguise their biases. Many of the people appearing on political news shows or writing about the election in a publication have decidedly partisan experience or interests. It is quite common to see former staffers of politicians, or politicians themselves, interviewed to give their opinion on the race. Wouldn’t they have at least a slight inclination to provide a lopsided view?

In other cases, activists for particular political causes, think-tank employees, political consultants, current or former campaign managers, and reporters from partisan-leaning media outlets are all common guests and commentators. The conflicts of interest are astounding; of course their discourse is biased. Furthermore, these (biased) individuals are often portrayed as opinion leaders with specialized political knowledge to whom the public should listen.

As for polls, they are excellent sources of information and it would be foolish not to use them. But analysts incorrectly treat them as the truth and have become overly reliant on them, basing their entire analysis on polls alone. The first problem is bias: as many of the posts on Social Desirability Bias show, polls can become very distorted in emotional elections, and those distortions are difficult to detect unless you understand the signals. The second is that poll responses are re-weighted by non-transparent methods, presumably different for each polling agency but normally based on things like the turnout of various demographic groups in previous elections. In this sense many polls are backward-looking, which is fine only if the current election closely resembles previous ones (which does not appear to be the case in 2016). The third is that analysts feed polls into election models in which polls are by far the most important, if not the only, variable. In essence, these analysts are simply compiling polls they likely do not fully understand, treating them as the standard of truth in politics, and plugging them into mathematical models.

Although these weaknesses appear glaring, they go more or less unnoticed by the general public. What we are left with is extremely biased commentary and weak forecasting models, with nobody complaining loudly enough to demand an alternative.

There are multiple solutions, or at least patches, for these problems. Transparency is essential to improving the quality of current pundit and poll analysis. These topics are covered in a different post, but the essential recommendation is to radically increase transparency along the lines of what has occurred within the financial industry over the last few decades. In a sweetly ironic twist, the ‘political industry’, which regularly turns to the financial industry and financial professionals for campaign donations and just as regularly decries them in public as devils incarnate, would have to admit its own lack of transparency and copy transparency standards created by financial professionals. If it does not do this willingly, transparency could very well be forced upon it after some future scandal created by that very opacity.

As for improving political forecasting models, the solution also lies with the financial industry. Financial analysts live and die by the forecasting models they use to create every sort of projection for the companies and industries they cover. The base of these models is a set of ‘drivers’, the primary building blocks that determine a company’s results. For instance, assume the company being analyzed is a retailer and the analyst is trying to forecast its annual results. Before making any forecast of those results, the analyst would identify the variables that determine them. Drivers could include economic drivers (forecasts for GDP growth, retail sales growth, inflation), revenue drivers (forecasts for unit selling prices, unit volumes, new store openings), cost drivers (forecasts for wages, rent, taxes), and any other group of variables that affects what the analyst is trying to forecast. This is obviously not a complete list, just enough to give an idea of what financial analysts try to identify and then forecast in order to produce better overall forecasts. Part of the concept is to break the larger forecasting problem into smaller, more manageable ones, which allows analysts to better understand the underlying conditions.
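
To make the structure concrete, here is a minimal sketch of how such drivers might be wired together for a hypothetical retailer; every driver name and figure below is an illustrative assumption, not data from any real company.

```python
# Minimal sketch of a driver-based annual forecast for a hypothetical retailer.
# All driver values are illustrative assumptions, not real data.

drivers = {
    "gdp_growth":      0.02,     # economic driver: assumed GDP growth
    "avg_unit_price":  25.0,     # revenue driver: assumed average selling price
    "units_per_store": 40_000,   # revenue driver: assumed annual unit volume per store
    "store_count":     120,      # revenue driver: existing stores plus planned openings
    "wage_cost_ratio": 0.30,     # cost driver: wages as a share of revenue
    "rent_per_store":  150_000,  # cost driver: assumed annual rent per store
    "tax_rate":        0.25,     # cost driver: assumed effective tax rate
}

def forecast_annual_profit(d: dict) -> float:
    """Combine the individual drivers into one annual profit forecast."""
    # Revenue is built up from volume, price, store count and the macro backdrop.
    revenue = d["avg_unit_price"] * d["units_per_store"] * d["store_count"] * (1 + d["gdp_growth"])
    # Costs are built up from their own drivers.
    costs = revenue * d["wage_cost_ratio"] + d["rent_per_store"] * d["store_count"]
    pre_tax = revenue - costs
    return pre_tax * (1 - d["tax_rate"])

print(f"Forecast profit: {forecast_annual_profit(drivers):,.0f}")
```

The point is not the numbers but the structure: each driver is a separate, visible assumption that can be debated and revised on its own.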

Political analysts need to produce similar drivers for this election. A true forecast of election results should include turnout drivers (forecasts for the growth rates of various demographic groups, interest in the election by demographic group, social media activity volume, turnout rates for target groups), voting intention drivers (forecasts for voting intention by demographic group, online activity favoring each candidate, online activity around the election’s main topics), bias drivers (forecasts for the differentials between live and anonymous polls, levels of Social Desirability Bias, the allocation of undecided voters’ preferences), and any other drivers that affect the actual election result.
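
Here is a minimal sketch of how those driver groups could be combined into a headline vote forecast, assuming a simple two-candidate race; the group labels, eligible-voter counts, turnout rates, polled shares, and bias adjustments are all illustrative assumptions.

```python
# Minimal sketch: building an election forecast from turnout, voting-intention
# and bias drivers per demographic group. All figures are illustrative assumptions.

groups = {
    # group: (eligible voters in millions, assumed turnout rate,
    #         polled two-party share for candidate A, assumed bias adjustment)
    "group_1": (30.0, 0.60, 0.55, +0.00),
    "group_2": (25.0, 0.48, 0.65, -0.02),  # bias driver trims the polled share
    "group_3": (60.0, 0.55, 0.47, -0.03),
    "group_4": (45.0, 0.42, 0.40, -0.04),
}

def forecast_two_party_share(groups: dict) -> float:
    """Return candidate A's forecast share of the two-party vote."""
    votes_a = votes_total = 0.0
    for eligible, turnout, polled_share, bias_adj in groups.values():
        voters = eligible * turnout            # turnout driver
        share_a = polled_share + bias_adj      # intention driver plus bias driver
        votes_a += voters * share_a
        votes_total += voters
    return votes_a / votes_total

print(f"Candidate A two-party share: {forecast_two_party_share(groups):.1%}")
```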

In its most basic form, political analysts’ forecasting should focus on turnout and voting intention (since bias is still not recognized by many political analysts, it would likely not be included in most models). Their models should dive deep into turnout and create a variety of scenarios around different assumptions (a simple scenario sketch follows the questions below). By most measures the 2016 election will differ significantly from previous ones because of potentially large swings in voter turnout by demographic group, especially relative to recent elections.

Even a cursory review of election news raises some obvious questions about 2016 voter turnout for different demographic groups:

Will African-Americans’ interest in the 2016 election match that of 2008 and 2012 without Obama on the ticket, and to what extent could any drop change the landscape for the Democrats?

Immigration has been an extremely important topic in 2016; how will this affect Hispanic-American voter turnout and voting inclination?

Having the first female candidate of a major party in 2016 could push turnout among women, already the largest voter demographic group, to new highs; will it, and to what extent could this change the race?

Under-educated white males who normally do not vote are disproportionately supporting Trump. What are the chances that they will actually come out and vote rather than just attending free rallies, what kind of impact would that have, and in which states?

Political analysts should be creating models to answer these and similar questions, as they amount to the ‘drivers’ for this election.
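
Below is a rough scenario sketch tied to those four questions. Every eligible-voter count, vote share, and turnout rate is an illustrative assumption, and in a real model the groups would have to be defined so they do not overlap (women, for instance, span the other groups listed).

```python
# Minimal sketch: sweeping turnout scenarios for a few demographic groups.
# Eligible-voter counts, vote shares and turnout rates are illustrative assumptions,
# and the groups are treated as separate here only for illustration (they overlap in reality).

# group: (eligible voters in millions, assumed two-party share for the Democrat)
base = {
    "african_american":      (27.0, 0.90),
    "hispanic":              (27.0, 0.65),
    "women_overall":         (120.0, 0.52),
    "non_college_white_men": (40.0, 0.30),
}

scenarios = {
    # assumed turnout rate per group under each scenario
    "2012_like":         {"african_american": 0.66, "hispanic": 0.48, "women_overall": 0.63, "non_college_white_men": 0.52},
    "low_black_turnout": {"african_american": 0.58, "hispanic": 0.48, "women_overall": 0.63, "non_college_white_men": 0.52},
    "trump_surge":       {"african_american": 0.60, "hispanic": 0.50, "women_overall": 0.63, "non_college_white_men": 0.60},
}

def democrat_share(turnout: dict) -> float:
    """Democrat share of the two-party vote across these groups only."""
    dem = total = 0.0
    for group, (eligible, dem_share) in base.items():
        voters = eligible * turnout[group]
        dem += voters * dem_share
        total += voters
    return dem / total

for name, turnout in scenarios.items():
    print(f"{name:>18}: {democrat_share(turnout):.1%}")
```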

Turnout in any of these four demographic groups could radically change the outcome of the election. Yet as analysis is currently conducted, these hard questions receive superficial coverage and little real digging. Analysts need to treat the drivers as individual problems and then use each study as a building block toward the bigger problem of forecasting the election result.

A building-block approach also helps to uncover potential errors in the analysis. Criticizing, for instance, a forecast turnout rate for African-American voters is much easier than criticizing a final election forecast: the final forecast involves so many moving parts that questioning its conclusion is almost pointless unless you have access to all of the underlying data (which is not normally the case in election analysis).
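
One way to see this benefit is a simple sensitivity check: perturb one driver at a time and watch how much the headline forecast moves. The sketch below reuses the hypothetical group figures from the earlier examples; none of them are real estimates.

```python
# Minimal sketch: per-driver sensitivity of a decomposed forecast.
# Perturbing one assumption at a time shows which drivers most deserve scrutiny;
# a single poll-based topline offers no equivalent breakdown. Figures are illustrative.

base_turnout = {"group_1": 0.60, "group_2": 0.48, "group_3": 0.55, "group_4": 0.42}
eligible     = {"group_1": 30.0, "group_2": 25.0, "group_3": 60.0, "group_4": 45.0}
dem_share    = {"group_1": 0.55, "group_2": 0.65, "group_3": 0.47, "group_4": 0.40}

def forecast(turnout: dict) -> float:
    """Democrat share of the two-party vote given per-group turnout assumptions."""
    dem = total = 0.0
    for g, rate in turnout.items():
        voters = eligible[g] * rate
        dem += voters * dem_share[g]
        total += voters
    return dem / total

baseline = forecast(base_turnout)
for g in base_turnout:
    # Bump one group's turnout assumption by 5 points and measure the swing.
    bumped = dict(base_turnout, **{g: base_turnout[g] + 0.05})
    print(f"{g}: {forecast(bumped) - baseline:+.2%} swing from a 5 pt turnout bump")
```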
Currently, analysts appear to take poll output (which embeds a series of non-transparent assumptions about things like voter turnout) and blindly feed those summarized numbers into a national or state-level electoral college model. Though built on lots of numbers and ‘precise’ calculations, this really amounts to back-of-the-envelope estimation, since the political analysts are fully reliant on assumptions made by others and do not appear to understand the underlying issues.

To at least partially address this, this series of posts includes reviews of the analysis, assumptions, and turnout forecasts for a variety of demographic groups that are likely to influence the race. These are not meant to be exhaustive studies but logical conclusions based on available data. They are here to give an idea of what professional political analysts should be doing and to offer some insight into why many of the headline figures from current polls appear distorted.

By breaking the larger forecasting challenge down into smaller ones, an analyst can also more effectively pick apart the forecasts of others.