Votes on Twitter: Assessing Candidate Preferences and Topics of Discussion During the 2016 U.S. Presidential Election

Social media offers scholars new and innovative ways of understanding public opinion, including citizens’ prospective votes in elections and referenda. We classify social media users’ preferences over the two U.S. presidential candidates in the 2016 election using Twitter data and explore the topics of conversation among proClinton and proTrump users. We use hashtags that signaled users’ vote preferences to train our machine learning model, which employs a novel classifier, a Topic-Based Naive Bayes model, that we demonstrate improves on existing classifiers. Our findings show that we can classify users with a high degree of accuracy and precision. We further explore the similarities and divergences in what proClinton and proTrump users discussed on Twitter.
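The hashtag-based training step described above can be illustrated with a minimal sketch. It is not the Topic-Based Naive Bayes model itself, whose details are not given here; it substitutes a plain multinomial Naive Bayes with add-one smoothing. The seed hashtags, the toy tweets, and all function names are hypothetical and serve only to show how hashtags might supply training labels without manual annotation.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical seed hashtags (assumption): the hashtag lists actually used
# in the study are not stated in this text.
PRO_CLINTON_TAGS = {"#imwithher"}
PRO_TRUMP_TAGS = {"#maga"}
SEED_TAGS = PRO_CLINTON_TAGS | PRO_TRUMP_TAGS

TOKEN_RE = re.compile(r"#?\w+")


def tokenize(text):
    """Lowercase the text and split it into word and hashtag tokens."""
    return TOKEN_RE.findall(text.lower())


def label_by_hashtag(text):
    """Return a label if exactly one camp's seed hashtags appear, else None."""
    tokens = set(tokenize(text))
    has_clinton = bool(tokens & PRO_CLINTON_TAGS)
    has_trump = bool(tokens & PRO_TRUMP_TAGS)
    if has_clinton == has_trump:  # neither camp, or both: ambiguous
        return None
    return "proClinton" if has_clinton else "proTrump"


def build_training_set(tweets):
    """Keep hashtag-labeled tweets and drop the seed hashtags from the
    features, so the classifier cannot simply memorize the labels."""
    data = []
    for text in tweets:
        label = label_by_hashtag(text)
        if label is not None:
            features = [t for t in tokenize(text) if t not in SEED_TAGS]
            data.append((features, label))
    return data


def train_nb(labeled_docs):
    """Collect the counts needed by a multinomial Naive Bayes classifier."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in labeled_docs:
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def predict_nb(model, tokens):
    """Return the label with the highest smoothed log-posterior."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        score = math.log(class_docs[label] / total_docs)
        class_total = sum(word_counts[label].values())
        for token in tokens:
            if token in vocab:  # ignore words never seen in training
                count = word_counts[label][token]
                score += math.log((count + 1) / (class_total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy tweets (invented for illustration only).
tweets = [
    "Stronger together #imwithher",
    "Proud to vote for equality #imwithher",
    "Build the wall #maga",
    "Drain the swamp #maga",
    "Watching the debate tonight",  # no seed hashtag, so it stays unlabeled
]

model = train_nb(build_training_set(tweets))
prediction = predict_nb(model, tokenize("time to drain the swamp"))
```

In this sketch, tweets carrying hashtags from both camps or from neither are left unlabeled, and the seed hashtags are removed from the features before training; both are design assumptions of the example rather than steps confirmed by the text.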
