Machine Learning meets Data-Driven Journalism: Boosting International Understanding and Transparency in News Coverage

Migration crises, climate change, or tax havens: global challenges need global solutions. But agreeing on a joint approach is difficult without a common ground for discussion. Public spheres are highly segmented because news is mainly produced and received at the national level. Gaining a global view of international debates on important issues is hindered by the enormous quantity of news and by language barriers. Media analysis usually focuses only on qualitative research. In this position statement, we argue that it is imperative to pool methods from machine learning, journalism studies, and statistics to help bridge the segmented data of the international public sphere, using the Transatlantic Trade and Investment Partnership (TTIP) as a case study.
