For Elections, Is Social Data More Predictive Than Traditional Polls?


Phil Burch, Former Contributor

One critical aspect of democratic societies is public opinion polling. Polling data gives political leaders insight into the public’s perception of issues and events. Without these polls, the public would have far fewer ways to make its voice heard when critical policy decisions are being discussed and made.

Seen through this lens, it is critical that democratic societies can trust the validity of public opinion polling. Yet evidence suggests that current polling methodologies are in crisis. In the last three years, there have been several notable examples of public opinion polls failing to predict election outcomes.

Recently, the New York Times and US News and World Report have both reported extensively on these shortcomings.

Joseph P. Williams of US News and World Report cites three key examples of traditional polling falling short:

  • In the days before the 2012 election, polls showed Mitt Romney essentially tied with President Obama; Romney lost by about 4 points nationally.
  • In 2014, Mitch McConnell of Kentucky (the current Senate Majority Leader) was thought to be in a dogfight with Alison Lundergan Grimes for his Senate seat; McConnell crushed his opposition at the polls (56.2 to 40.7).
  • Also in 2014, polls showed Scotland's referendum on whether to stay in the United Kingdom as neck and neck; Scots voted decisively to stay part of the UK.

These three examples have led many pollsters to ask ‘why were we so far off?’ The methodology of traditional polling appears to be the primary cause of the accuracy problem.

The New York Times article cites two primary reasons for these shortcomings: the proliferation of cell phones (and the severe reduction in landline telephones particularly in the millennial demographic) and a general lack of willingness to participate in surveys.

The cell phone issue is particularly problematic. Under the 1991 Telephone Consumer Protection Act (and subsequent FCC rulings), it is illegal for pollsters to use automatic dialers to call cell phone numbers. The NY Times article reports that it can take 20,000 manual dials to reach the necessary 1,000-person random sample. As a result, the cost of polling has gone up drastically, which naturally reduces the amount of quality research that can be done on these topics.
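To put the dialing figure cited above in perspective, here is a quick back-of-the-envelope calculation in Python. The 20,000 dials and the 1,000-person sample are the numbers from the Times article; the function name is just a placeholder for illustration.

```python
def completion_rate(completed_interviews: int, dials: int) -> float:
    """Fraction of dials that end in a completed interview."""
    if dials <= 0:
        raise ValueError("dials must be positive")
    return completed_interviews / dials


# Figures cited above: about 20,000 manual dials for a 1,000-person sample.
rate = completion_rate(1_000, 20_000)
dials_per_complete = 1 / rate

print(f"Completion rate: {rate:.1%}")
print(f"Dials per completed interview: {dials_per_complete:.0f}")
```

In other words, roughly one dial in twenty ends in a completed interview, and every one of those dials has to be placed by hand.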

Looking To Social Data

Over the next several months, I am going to investigate whether social media data from Twitter (using Sysomos MAP) and Facebook (using Sysomos Scout) can be a better overall indicator of voter sentiment than traditional polling data.

For this first round, I have gathered insights from the country as a whole, as well as data from Iowa, New Hampshire, and South Carolina, since these states hold the first three contests in February.

I used 11/25/2015 through 12/14/2015 as my first timeframe for analysis. I will revisit these numbers as we get closer to the primaries and the general election to see whether social media volumes and insights can be a more predictive indicator than traditional polling.

My methodology used simple searches of the candidates' names and titles (Hillary Clinton and Bernie Sanders for the Democratic candidates and Ted Cruz, Donald Trump, Marco Rubio, Ben Carson and Jeb Bush for the Republican candidates) on both Facebook and Twitter. I have also included polling data on the Twitter graphs for further context.

My first foray into this analysis is below.
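For readers who want to try a name-based search on their own data, here is a minimal Python sketch of the general idea. It is not how Sysomos MAP or Scout work internally. The function names and the sample posts are made up for illustration, and matching on full names alone is a simplifying assumption.

```python
from collections import Counter

# Candidate names used as search terms (taken from the methodology above).
CANDIDATES = [
    "Hillary Clinton", "Bernie Sanders", "Ted Cruz", "Donald Trump",
    "Marco Rubio", "Ben Carson", "Jeb Bush",
]


def count_mentions(posts, candidates=CANDIDATES):
    """Count posts that mention each candidate's full name (case-insensitive)."""
    counts = Counter({name: 0 for name in candidates})
    for post in posts:
        text = post.lower()
        for name in candidates:
            if name.lower() in text:
                counts[name] += 1
    return counts


def share_of_voice(counts):
    """Convert raw mention counts into each candidate's share of all mentions."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: n / total for name, n in counts.items()}


# Made-up posts for illustration only; not real Twitter or Facebook data.
sample_posts = [
    "Watching the Bernie Sanders rally tonight",
    "Hillary Clinton and Bernie Sanders debate recap",
    "Donald Trump is on every channel again",
]
counts = count_mentions(sample_posts)
shares = share_of_voice(counts)
```

A post that names two candidates counts once for each, so shares are taken over mentions rather than over posts. Full-name matching also misses first names, nicknames and hashtags, so a real query would need broader search terms.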

Democratic Candidates Twitter Conversation Volumes vs Polling Numbers

The above chart shows Twitter conversation volumes versus traditional polling data for Hillary Clinton and Bernie Sanders. Clinton is ahead in the polls, at about 54% to Sanders' 33%. However, when we look at the Twitter conversation volume around each candidate, Sanders leads the way and has done so consistently over the period I analyzed.

Conversation Volumes of Democratic Candidates on Facebook by Age

When we look at the age of those talking about these two candidates on Facebook, we see that Sanders is talked about much more by those aged 18-34, while those aged 35-44 seem to be talking about both candidates almost equally.

If we go back to the thought above that many millennials may be missed by traditional polling calls because they tend to have cell phones rather than landlines, we can start to infer that the polling data may be missing this large group of the voting public. With these younger voices, which seem more interested in talking about Sanders than Clinton, missing from the traditional polls, we may soon learn that when it comes time to vote, younger voters make a huge difference in the outcome. If Sanders can mobilize them, we may see another example of traditional polling being way off from the actual outcome.

The Republican party is in a different but equally interesting position. When I looked at the data around the party, I found that Donald Trump leads both in social conversations, on Twitter and on Facebook, and in the latest polling numbers.

Republican Candidates Twitter Conversation Volumes vs Polling Numbers

Share Of Voice of Republican Candidates on Facebook

However, I also looked closely at the sentiment around all of those social conversations about Trump and found the majority to be negative.

Twitter Sentiment of Donald Trump

The Republican race will prove to be the most interesting because of this. With high social conversation volumes around Trump, most of it negative, we might conclude that he has a largely negative public image. However, social data can’t discern whether posts come from Democrats or Republicans.

There is a very good chance that much of this negative talk about Trump is coming from Democratic supporters, who will have no say in whether he becomes the official Republican candidate. Those polled through traditional means on the Republican candidates, by contrast, do have a say in which candidate wins. So, for this party, no matter how much negative talk about the candidate we see in the social world, the results may not end up far from what the traditional polls are showing. I’m going to keep an extra vigilant eye on this race in my follow-up reports.
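As a rough illustration of what a sentiment breakdown involves, here is a small Python sketch that tallies labeled posts. It assumes each post has already been given a label of "positive", "negative" or "neutral" by some classifier. How Sysomos assigns sentiment is not covered in this post, and the labels below are invented for the example.

```python
from collections import Counter


def sentiment_shares(labels):
    """Return the fraction of posts carrying each sentiment label."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


def net_sentiment(shares):
    """Positive share minus negative share; below zero means more negative talk."""
    return shares.get("positive", 0.0) - shares.get("negative", 0.0)


# Made-up labels for illustration only; not the Trump data discussed in this post.
labels = ["negative", "negative", "positive", "neutral", "negative"]
shares = sentiment_shares(labels)
print(shares)
print(net_sentiment(shares))
```

A tally like this says how much of the talk is negative but nothing about who is talking, which is exactly the limitation described above.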

My goal will be to predict the outcomes of the primary elections in February based on this social data to see if social volumes can be a better indicator than traditional polling methods. So keep watching this space for more updates!

6 Comments on “For Elections, Is Social Data More Predictive Than Traditional Polls?”

  1. Phil – I did some predictive analysis of social sentiment and activity prior to the Scottish referendum. Interestingly, the social media sentiment was significantly in favor of independence from the UK. I am still trying to figure out what the core reasons for the discrepancy were. Theories include: 1. Last minute nerves – the Scottish Daily Mail newspaper (most popular in Scotland) ran several stories discrediting the Scottish leadership figures just prior to the vote. 2. Social media activity was largely taken over by vocal Scottish nationalists. The silent majority weren’t on Twitter or influenced by social media. 3. The voting was rigged in favour of staying in the UK 🙂

    1. Hey Larry – Hope you’re well. The Scottish referendum is certainly an interesting case study. It’s interesting to hear that Social Data was significantly in favor of leaving the UK when the election went the other way. This is definitely something I will remember for my (ongoing) analysis of the Presidential Election. The good news is that because of the State Primaries we have in the US, I have about 50 test cases that I can play around with to get closer to a more predictive model before the general election in November. I will be writing the next blog on this topic in the next couple weeks, so it would be great to connect to learn more about the approach you took for the Scottish Referendum (and to catch up in general!). I have a few ideas to help reconcile the Facebook data in particular and would love to hear your feedback. Thanks for the comments – very interesting and helpful!

    2. I remember we did an analysis of the Scottish referendum as well here on the blog – https://blog.sysomos.com/2014/09/16/scottish-referendum/
      At the time, according to the social data it seemed that people were in favour of departing from the UK, but when it came down to the actual vote it went the other way.
      When I looked back at it all (not on the blog, just personally) I chalked it up to that silent majority. The people who wanted big change were the loudest because they were really hoping to see the change. Meanwhile, those who were perfectly happy continuing to be part of the UK just went on with their normal lives and said what they thought through their votes.
      While noise in the social space can be a good indicator, there are also a lot of people who don’t tend to use social media or speak up, but their votes still count. I think watching how the US election plays out will be interesting, as there are a lot of people making noise about it on social, but there are also a lot of citizens who vote and don’t say anything about it on social.
      In the case of the reality TV shows though, the demographic is a little more skewed to people who want to be part of the action and tweet along with the shows they love, so those silent people don’t wind up actually being the majority in that case.
      It’s going to be fun to watch how this whole thing plays out.

      Cheers,
      Sheldon, community manager for Sysomos

  2. Interestingly, reality TV show results can also be predicted from social media sentiment. I worked with some students a few years ago on Britain’s Got Talent and successfully predicted that the favorite act (Susan Boyle) would not win, despite all the press and bookies saying she would. She was runner-up. Reality TV shows are closely followed on social media and, unlike in some political contests, social media there does mirror consumer sentiment.
