Illuminating 2016: Helping political reporters cover social media during the 2016 presidential campaign

Although Donald Trump seems to get the most attention from journalists who cover what he tweets, all of the presidential candidates are tweeting, posting, Snapchatting, and Instagramming. Their social media not only drives what reporters say about the campaigns, but also mobilizes their supporters, generates needed cash, and draws out opponents in extended tit-for-tats.

Several projects have begun to visualize the frequency of social media postings by candidates, mostly on Twitter, reporting the numbers of followers and changes in follower rates, for example, but none are looking carefully at what the candidates are actually saying. Part of the reason is that it’s easy to count frequencies of already structured data, like number of followers. It’s much harder to categorize, analyze, and count what they are saying in the unstructured data of tweets and Facebook posts.

That’s where Illuminating 2016 comes in. Our goal is to advance public understanding of what the presidential candidates are saying through their social media accounts. We are doing that by using state-of-the-art computational approaches for studying unstructured text.

We’ve been collecting all of the announced major party candidates’ Twitter and Facebook posts since they declared their presidential bids. In all we have filled 6 servers with 24 presidential candidates’ social media messages, and of course we’re still collecting. We’ve developed categories for classifying the candidates’ messages, looking at their strategic messages that promote their policies or attack their opponents, their calls-to-action, their conversations with the public, and their attempts to inform. We’ve trained computer models to sort the messages into 9 categories, and have so far achieved 70% accuracy, measured by comparing the algorithm’s classifications against data where the “truth” has been established. Our categories are: attack, advocacy, image, issue, endorsement, calls-to-action, conversation, information, and ceremonial.
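To make the accuracy figure concrete, here is a minimal sketch of how such a number can be computed: count how often a model's label matches the established "truth" label and divide by the number of messages. The function names and the ten example labels below are hypothetical and chosen only for illustration; they are not our data or our actual evaluation code.

```python
from collections import Counter

# The nine categories named in this post.
CATEGORIES = [
    "attack", "advocacy", "image", "issue", "endorsement",
    "calls-to-action", "conversation", "information", "ceremonial",
]


def accuracy(predicted, truth):
    """Fraction of messages where the model's label equals the truth label."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    if not truth:
        raise ValueError("need at least one labeled message")
    matches = sum(p == t for p, t in zip(predicted, truth))
    return matches / len(truth)


def per_category_accuracy(predicted, truth):
    """Accuracy broken out by truth category, to show where a model struggles."""
    totals = Counter(truth)
    hits = Counter(t for p, t in zip(predicted, truth) if p == t)
    return {category: hits[category] / totals[category] for category in totals}


# Hypothetical labels for ten messages (illustrative only, not project data).
truth_labels = [
    "attack", "advocacy", "issue", "image", "calls-to-action",
    "attack", "information", "conversation", "endorsement", "ceremonial",
]
model_labels = [
    "attack", "advocacy", "issue", "issue", "calls-to-action",
    "advocacy", "information", "conversation", "endorsement", "information",
]

print(accuracy(model_labels, truth_labels))  # 7 of 10 match -> 0.7
print(per_category_accuracy(model_labels, truth_labels)["attack"])  # 1 of 2 -> 0.5
```

An overall figure can hide large differences between categories, which is why the sketch also breaks the score out per category.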

Our next step is to learn more about what political reporters need to help them understand what the candidates are saying on social media. We will be conducting interviews in April. We also need to learn how journalists can best use the data we have been collecting and generating to support their job of covering what has become a remarkable election year. In May, we will run user tests with journalists to see what they find useful, and what they don’t, in the data and visualizations we have produced so far.