Paper information - Tom Jumbo-Grumbo at SemEval-2019 Task 4: Hyperpartisan News Detection with GloVe vectors and SVM

In this paper, we describe our attempt to learn bias from news articles. Our experiments suggest that, although publisher bias and article bias are correlated, it is difficult to learn bias directly from the publisher labels. On the other hand, using a few manually labeled samples can raise accuracy from around 60% to nearly 80%. Our system is computationally inexpensive and uses several standard NLP document representations to train an SVM or LR classifier. The system ranked 4th in the SemEval-2019 task. The code is released for reproducibility.
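The general recipe mentioned in the abstract (a fixed document representation fed to a linear classifier) can be illustrated with a minimal sketch. This is not the authors' released code. It assumes that a document is represented by the average of its word vectors, uses a tiny hand-made 2-dimensional table (`TOY_VECTORS`, hypothetical) in place of pretrained GloVe vectors, and fits a hand-written logistic regression (the "LR" option) rather than an SVM so that the example runs with the standard library only. All names, texts, and labels below are made up for illustration.

```python
import math

# Hypothetical 2-d "embeddings" standing in for pretrained GloVe vectors.
TOY_VECTORS = {
    "traitor": [0.9, 0.1],
    "disgrace": [0.8, 0.2],
    "outrage": [0.7, 0.0],
    "report": [0.1, 0.9],
    "committee": [0.0, 0.8],
    "statement": [0.2, 0.7],
}
DIM = 2


def embed(text):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    vecs = [TOY_VECTORS[t] for t in text.lower().split() if t in TOY_VECTORS]
    if not vecs:
        return [0.0] * DIM
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(DIM)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_lr(X, y, lr=0.5, epochs=500):
    """Fit a logistic regression by batch gradient descent."""
    w = [0.0] * DIM
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * DIM
        grad_b = 0.0
        for x, label in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - label
            for i in range(DIM):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(text, w, b):
    """Return 1 (hyperpartisan) or 0 (not) for a piece of text."""
    x = embed(text)
    return int(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5)


# Made-up training set: 1 = hyperpartisan, 0 = not hyperpartisan.
train_texts = [
    "traitor disgrace", "outrage traitor", "disgrace outrage",
    "committee report", "statement committee", "report statement",
]
train_labels = [1, 1, 1, 0, 0, 0]

X = [embed(t) for t in train_texts]
w, b = train_lr(X, train_labels)

print(predict("outrage disgrace traitor", w, b))
print(predict("committee statement report", w, b))
```

In a realistic setting, one would presumably load pretrained GloVe vectors and use a library classifier in place of the toy table and the hand-written training loop; the averaging step is the only part this sketch is meant to make concrete.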
