Utilizing overtly political texts for fully automatic evaluation of political leaning of online news websites

Purpose – The reliability and political bias of mass media have been a controversial topic in the literature. The purpose of this paper is to propose and implement a methodology for fully automatic evaluation of the political tendency of written media on the web, one that does not rely on subjective human judgments.

Design/methodology/approach – The underlying idea is to base the evaluation on a fully automatic comparison of the texts of articles on different news websites with overtly political texts of known political orientation. The authors also apply an alternative approach to evaluating political tendency, based on the wisdom of the crowds.

Findings – The authors found that the learnt classifier can accurately distinguish between self-declared left and right news sites. Furthermore, news sites’ political tendencies can be identified by an automatic classifier learnt from manifestly political texts, without recourse to any manually tagged data. The authors also show a high correlation between readers’ perce...
