Analyzing mass media influence using natural language processing and time series analysis

A key question in collective social behavior concerns the influence of mass media on public opinion. Different approaches have been developed to address this issue quantitatively, ranging from field experiments to mathematical models. In this work we propose a combination of tools involving natural language processing and time series analysis. We compare selected features of mass media news articles with measurable manifestations of public opinion. We apply our analysis to news articles from the 2016 U.S. presidential campaign. We compare variations in polls (as a proxy for public opinion) with changes in the connotation of the news (sentiment) or in the agenda (topics) of a selected group of media outlets. Our results suggest that sentiment content by itself is not enough to understand the differences in polls, but that the combination of topic coverage and sentiment content provides useful insight into the context in which public opinion varies. The methodology employed in this work is fairly general and can easily be extended to other topics of interest.
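As a rough illustration of how a media time series might be compared with poll variations, the sketch below computes a lagged Pearson correlation between two daily series. It is not the procedure used in this work: the function name `lagged_correlation`, the synthetic data, and the three-day delay are all assumptions made only for the example.

```python
import numpy as np


def lagged_correlation(media, polls, max_lag):
    """Pearson correlation between media[t] and polls[t + lag], lag = 0..max_lag.

    Hypothetical helper for illustration; both inputs are equal-length 1-D series.
    """
    media = np.asarray(media, dtype=float)
    polls = np.asarray(polls, dtype=float)
    if media.shape != polls.shape or media.ndim != 1:
        raise ValueError("media and polls must be 1-D series of equal length")
    result = {}
    for lag in range(max_lag + 1):
        x = media[: len(media) - lag]  # media values at time t
        y = polls[lag:]                # poll values at time t + lag
        result[lag] = float(np.corrcoef(x, y)[0, 1])
    return result


# Synthetic data (assumption): poll changes follow sentiment with a 3-day delay.
rng = np.random.default_rng(0)
sentiment = rng.normal(size=200)
poll_change = np.empty(200)
poll_change[:3] = rng.normal(size=3)
poll_change[3:] = 0.8 * sentiment[:-3] + 0.2 * rng.normal(size=197)

correlations = lagged_correlation(sentiment, poll_change, max_lag=7)
# Lag at which the absolute correlation is largest.
best_lag = max(correlations, key=lambda k: abs(correlations[k]))
print(best_lag, round(correlations[best_lag], 2))
```

A high correlation at some lag only indicates that the two series move together with that delay; it does not by itself establish that the media drive the polls.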

[1]  Christopher Wlezien,et al.  It's (change in) the (future) economy, stupid: : economic indicators, the media, and public opinion , 2015 .

[2]  Maite Taboada,et al.  Lexicon-Based Methods for Sentiment Analysis , 2011, CL.

[3]  Soojong Kim,et al.  Does newspaper coverage influence or reflect public perceptions of the economy? , 2017 .

[4]  Robert Tibshirani,et al.  An Introduction to the Bootstrap , 1994 .

[5]  Yasufumi Shibanai,et al.  Effects of Global Information Feedback on Diversity , 2001 .

[6]  T. Besley,et al.  The Political Economy of Government Responsiveness: Theory and Evidence from India , 2000 .

[7]  C. Granger Investigating Causal Relations by Econometric Models and Cross-Spectral Methods , 1969 .

[8]  G. King,et al.  How the news media activate public expression and influence national agendas , 2017, Science.

[9]  M. Gentzkow,et al.  Social Media and Fake News in the 2016 Election , 2017 .

[10]  Hongchul Lee,et al.  Sentiment analysis of twitter audiences: Measuring the positive or negative influence of popular twitterers , 2012, J. Assoc. Inf. Sci. Technol..

[11]  Larry Hatcher,et al.  JMP for Basic Univariate and Multivariate Statistics: A Step-by-step Guide , 2005 .

[12]  M. McCombs A Look at Agenda-setting: past, present and future , 2005 .

[13]  Skipper Seabold,et al.  Statsmodels: Econometric and Statistical Modeling with Python , 2010, SciPy.

[14]  Claudio J. Tessone,et al.  A data-driven model for Mass Media influence in electoral context , 2019, ArXiv.

[15]  Robert H. Shumway,et al.  Time series analysis and its applications : with R examples , 2017 .

[16]  Claes H. de Vreese,et al.  Media Effects on Public Opinion About the Enlargement of the European Union , 2006 .

[17]  M. G. Cosenza,et al.  Local versus global interactions in nonequilibrium transitions: A model of social dynamics. , 2006, Physical review. E, Statistical, nonlinear, and soft matter physics.

[18]  Yuriy Gorodnichenko,et al.  Social Media, Sentiment and Public Opinions: Evidence from #Brexit and #Uselection , 2018, European Economic Review.

[19]  H. Sebastian Seung,et al.  Learning the parts of objects by non-negative matrix factorization , 1999, Nature.

[20]  B. Efron,et al.  Second thoughts on the bootstrap , 2003 .

[21]  Mihai Surdeanu,et al.  The Stanford CoreNLP Natural Language Processing Toolkit , 2014, ACL.

[22]  Olessia Koltsova,et al.  Mapping the public agenda with topic modeling: The case of the Russian livejournal , 2013 .

[23]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[24]  Taha Yasseri,et al.  Wikipedia traffic data and electoral prediction: towards theoretically informed models , 2016, EPJ Data Science.

[25]  Xin Liu,et al.  Document clustering based on non-negative matrix factorization , 2003, SIGIR.

[26]  Nirmalie Wiratunga,et al.  Contextual sentiment analysis for social media genres , 2016, Knowl. Based Syst..

[27]  Juliane A. Lischka What Follows What? Relations between Economic Indicators, Economic Expectations of the Public, and News on the General Economy and Unemployment in Germany, 2002-2011 , 2015 .

[28]  Martin Gerlach,et al.  A universal information theoretic approach to the identification of stopwords , 2019, Nature Machine Intelligence.

[29]  Dean S. Karlan,et al.  Does the Media Matter? A Field Experiment Measuring the Effect of Newspapers on Voting Behavior and Political Opinions , 2006 .

[30]  Craig Leonard Brians,et al.  Campaign Issue Knowledge and Salience: Comparing Reception from TV Commercials, TV News and Newspapers , 1996 .

[31]  Jan Snajder,et al.  Getting the Agenda Right: Measuring Media Agenda using Topic Models , 2015, TM@CIKM.

[32]  D. Shaw,et al.  Agenda setting function of mass media , 1972 .

[33]  Shannon E. Martin,et al.  The Intersection of Agenda-Setting, the Media Environment, and Election Campaign Laws , 2016, Journal of Information Policy.

[34]  Christopher Potts,et al.  Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank , 2013, EMNLP.

[35]  Pablo Balenzuela,et al.  Quantifying time-dependent Media Agenda and public opinion by topic modeling , 2018, Physica A: Statistical Mechanics and its Applications.

[36]  Pablo Balenzuela,et al.  Setting the Agenda: Different strategies of a Mass Media in a model of cultural dissemination , 2015, ArXiv.