For Elections, Is Social Data More Predictive Than Traditional Polls?

Phil Burch, Former Contributor

One critical aspect of democratic societies is public opinion polling. Polling data gives political leaders insight into the public’s perception of issues and events. Without these polls, the public would not have the ability to make their voices heard when critical policy decisions are being made and discussed.

Seen through this lens, it is critical that democratic societies can trust the validity of public opinion polling. Yet there is evidence that current polling methodologies are in crisis. In the last three years, there have been several notable examples of public opinion polls failing to predict election outcomes.

Recently, the New York Times and US News and World Report have both reported extensively on these shortcomings.

Joseph P. Williams of US News and World Report cites three key examples of traditional polling falling short:

  • In the days before the 2012 Election, Mitt Romney was essentially tied with President Obama – Romney lost by more than 5 points nationally.
  • In 2014, Mitch McConnell of Kentucky (the current Senate Majority Leader) was thought to be in a dogfight for his Senate seat against Alison Lundergan Grimes; McConnell crushed his opposition (56.7 to 40.7) at the polls.
  • And in 2014, Scotland's vote on whether to stay with the United Kingdom was thought to be neck and neck; Scots overwhelmingly voted to stay a part of the UK.

These three examples have led many pollsters to ask ‘why were we so far off?’ It appears that the methodology of traditional polling has been the primary issue with polling accuracy.

The New York Times article cites two primary reasons for these shortcomings: the proliferation of cell phones (and the severe reduction in landline telephones particularly in the millennial demographic) and a general lack of willingness to participate in surveys.

The cell phone issue is particularly problematic. Under the 1991 Telephone Consumer Protection Act (and subsequent rulings by the FCC), it is illegal for pollsters to use automatic dialers to call cell phone numbers. The NY Times article above notes that it can take 20,000 manual dials to reach the necessary 1,000-person random sample. As a result, the costs associated with polling have gone up drastically, which naturally reduces the amount of quality research that can be done on these topics.
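To put that figure in perspective, here is a minimal Python sketch of the arithmetic. The function names are illustrative only, and the two inputs are simply the numbers cited above (20,000 dials for a 1,000-person sample).

```python
def dials_per_complete(total_dials: int, completed_interviews: int) -> float:
    """Average number of dials needed for each completed interview."""
    return total_dials / completed_interviews


def completion_rate(total_dials: int, completed_interviews: int) -> float:
    """Fraction of dials that end in a completed interview."""
    return completed_interviews / total_dials


# Figures cited from the NY Times article discussed above.
dials = 20_000
sample = 1_000

print(dials_per_complete(dials, sample))        # 20.0
print(f"{completion_rate(dials, sample):.1%}")  # 5.0%
```

In other words, on those figures a pollster makes roughly 20 manual calls for every completed interview.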

Looking To Social Data

Over the course of the next several months, I am going to investigate whether social media data from Twitter (using Sysomos MAP) and Facebook (using Sysomos Scout) can be a better overall indicator of voter sentiment than traditional polling data.

For this first round, I have gathered insights from the country as a whole, as well as data from Iowa, New Hampshire, and South Carolina, since these states hold the first three primary contests in February.

My first period for analysis runs from 11/25/2015 through 12/14/2015. I will revisit these numbers as we get closer to the primaries and the general election to see if social media volumes and insights can be a more predictive indicator than traditional polling.

My methodology used simple searches of the candidates' names and titles (Hillary Clinton and Bernie Sanders for the Democratic candidates and Ted Cruz, Donald Trump, Marco Rubio, Ben Carson and Jeb Bush for the Republican candidates) on both Facebook and Twitter. I have also included polling data on the Twitter graphs for further context.
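As a rough illustration of what a name-based search and volume comparison could look like, here is a minimal Python sketch. It is not how Sysomos MAP or Scout work internally, which this article does not describe. It assumes a plain case-insensitive match on each candidate's full name, and the sample posts are invented placeholders, not real data.

```python
# Candidate names as listed in the methodology above.
CANDIDATES = [
    "Hillary Clinton", "Bernie Sanders",
    "Ted Cruz", "Donald Trump", "Marco Rubio", "Ben Carson", "Jeb Bush",
]


def count_mentions(posts, candidates):
    """Count how many posts mention each candidate's full name.

    Assumption: a case-insensitive substring match stands in for the
    'simple search' described above.
    """
    counts = {name: 0 for name in candidates}
    for post in posts:
        text = post.lower()
        for name in candidates:
            if name.lower() in text:
                counts[name] += 1
    return counts


def share_of_voice(counts):
    """Convert raw mention counts into each candidate's share of the total."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: n / total for name, n in counts.items()}


# Hypothetical placeholder posts, for illustration only.
sample_posts = [
    "Watching the Bernie Sanders town hall tonight",
    "Hillary Clinton and Bernie Sanders both spoke in Iowa",
    "Traffic is terrible today",
]

counts = count_mentions(sample_posts, ["Hillary Clinton", "Bernie Sanders"])
print(counts)
print(share_of_voice(counts))
```

A post that names two candidates counts once for each of them, so the shares describe mentions rather than unique posts.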

My first foray into this analysis is below.

Democratic Candidates Twitter Conversation Volumes vs Polling Numbers

The above chart shows Twitter conversation volumes versus traditional polling data for Hillary Clinton and Bernie Sanders. Here we can see that Clinton is ahead in the polls at about 54%, while Sanders comes in at 33%. However, when we look at the Twitter conversation volume around each candidate, Sanders leads the way and has done so consistently over the period I analyzed.

Conversation Volumes of Democratic Candidates on Facebook by Age

When we also look at the age of those talking about these two candidates on Facebook, we see that Sanders seems to be talked about much more by those aged 18-34, while those aged 35-44 seem to be talking about the two candidates almost equally.

If we go back to the thought above that many millennials may be missing from traditional polling calls because they tend to have cell phones rather than landlines, we can start to infer that the polling data may be missing this large group of the voting public. With the younger voice, which seems more interested in talking about Sanders than Clinton, missing from the traditional polls, we may soon learn that when it comes time to vote, these younger voters could make a huge difference in the outcome. If Sanders can mobilize these younger voters, we may see another example of traditional polling being way off from the actual outcome.

The Republican party is in a different but equally interesting position. When I looked at the data around the party, I found that Donald Trump leads both in social conversations, on Twitter and on Facebook, and in the latest polling numbers.

Republican Candidates Twitter Conversation Volumes vs Polling Numbers

Share Of Voice of Republican Candidates on Facebook

However, I also looked closely at the sentiment around all of those social conversations about Trump and found the majority to be negative.

Twitter Sentiment of Donald Trump

The Republican candidate race will prove to be the most interesting because of this. Given the high volume of social conversation around Trump, most of it negative, we might start to think that he has a largely negative image with the public. However, social data can’t discern whether posts come from Democrats or Republicans.

There is a very good chance that much of this negative talk about Trump is coming from Democratic supporters who will have no say in whether he becomes the official Republican candidate. At the same time, those polled through traditional means on the Republican candidates do have a say in which candidate is chosen. So, in the case of this party, no matter how much negative talk we see about the candidate in the social world, the results may not end up far off from what the traditional polls are showing. I’m going to keep an extra vigilant eye on this race during my follow-up reports.

My goal will be to predict the outcomes of the primary elections in February based on this social data to see if social volumes can be a better indicator than traditional polling methods. So keep watching this space for more updates!