Paper information - The liberal media and right-wing conspiracies: using cocitation information to estimate political orientation in web documents

The liberal media and right-wing conspiracies: using cocitation information to estimate political orientation in web documents

This paper introduces a simple method for estimating cultural orientation: the affiliation of online entities in a polarized field of discourse. In particular, cocitation information is used to estimate the political orientation of hypertext documents. Political orientation, a type of cultural orientation, is the degree to which a document participates in traditionally left- or right-wing beliefs. Estimating documents' political orientation is of interest for personalized information retrieval and recommender systems. In its application to politics, the method uses a simple probabilistic model to estimate the strength of association between a document and left- and right-wing communities. The model estimates the likelihood of cocitation between a document of interest and a small number of documents of known orientation. The model is tested on three data sets: 695 partisan web documents, 162 political weblogs, and 72 non-partisan documents. The cocitation model achieves accuracy above 90%, outperforming lexically based classifiers at statistically significant levels.
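To make the cocitation idea concrete, the following is a minimal sketch, not the paper's actual model. It assumes a log-odds score over smoothed cocitation rates, with positive values meaning a document is cocited more readily with left-wing exemplars. The function names, the `smoothing` parameter, the sign convention, and the toy link data are all hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, Set


def cocitation_count(citations: Dict[str, Set[str]], doc: str, exemplars: Set[str]) -> int:
    """Count citing pages that link to `doc` and to at least one exemplar."""
    return sum(1 for cited in citations.values() if doc in cited and cited & exemplars)


def exemplar_count(citations: Dict[str, Set[str]], exemplars: Set[str]) -> int:
    """Count citing pages that link to at least one exemplar."""
    return sum(1 for cited in citations.values() if cited & exemplars)


def orientation_score(
    citations: Dict[str, Set[str]],
    doc: str,
    left: Set[str],
    right: Set[str],
    smoothing: float = 0.5,  # assumed additive smoothing to avoid log(0)
) -> float:
    """Log-odds of cocitation with left vs. right exemplars.

    Assumed convention: positive leans left, negative leans right.
    """
    p_left = (cocitation_count(citations, doc, left) + smoothing) / (
        exemplar_count(citations, left) + 2 * smoothing
    )
    p_right = (cocitation_count(citations, doc, right) + smoothing) / (
        exemplar_count(citations, right) + 2 * smoothing
    )
    return math.log2(p_left / p_right)


# Toy data: each citing page maps to the set of documents it links to.
citations = {
    "page_a": {"left1", "doc_x"},
    "page_b": {"left2", "doc_x"},
    "page_c": {"right1", "doc_y"},
    "page_d": {"right1", "right2", "doc_y"},
    "page_e": {"left1", "right1"},
}
left_exemplars = {"left1", "left2"}
right_exemplars = {"right1", "right2"}

score_x = orientation_score(citations, "doc_x", left_exemplars, right_exemplars)
score_y = orientation_score(citations, "doc_y", left_exemplars, right_exemplars)
print(f"doc_x: {score_x:+.3f}, doc_y: {score_y:+.3f}")
```

In this toy example `doc_x` is cocited only with left-wing exemplars and `doc_y` only with right-wing ones, so their scores have opposite signs. A real application would draw cocitation counts from a link index or search engine rather than an in-memory dictionary.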

Miles Efron
