
Cross-Lingual Classification of Topics in Political Texts

In this paper, we propose an approach for cross-lingual topical coding of sentences from electoral manifestos of political parties in different languages. To this end, we exploit continuous semantic text representations and induce a joint multilingual semantic vector space, which enables supervised learning from manually coded sentences across different languages. Our experimental results show that classifiers trained on multilingual data yield performance boosts over monolingual topic classification.
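The abstract does not say how the joint vector space is induced or which classifier is used. The sketch below is only a toy illustration of the general idea, under stated assumptions: a linear projection learned by least squares from a small translation dictionary maps one language's word vectors into the other's space, sentences are embedded by averaging word vectors, and a nearest-centroid classifier stands in for the supervised model. All vectors, words, labels, and function names here are made up for illustration.

```python
import numpy as np

# Toy 2-d "English" word vectors (made-up values, illustration only).
EN = {
    "tax": np.array([1.0, 0.1]),
    "budget": np.array([0.9, 0.2]),
    "climate": np.array([0.1, 1.0]),
    "forest": np.array([0.2, 0.9]),
}

# Toy "German" space: the same geometry rotated by 90 degrees, mimicking
# two independently trained monolingual embedding spaces.
ROT = np.array([[0.0, 1.0], [-1.0, 0.0]])
DE = {
    "steuer": EN["tax"] @ ROT,
    "haushalt": EN["budget"] @ ROT,
    "klima": EN["climate"] @ ROT,
    "wald": EN["forest"] @ ROT,
}


def learn_projection(src, tgt, pairs):
    """Least-squares matrix W such that src[s] @ W is close to tgt[t]."""
    S = np.stack([src[s] for s, _ in pairs])
    T = np.stack([tgt[t] for _, t in pairs])
    W, *_ = np.linalg.lstsq(S, T, rcond=None)
    return W


def embed(tokens, vectors):
    """Sentence embedding: average of the vectors of the known tokens."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("no known tokens in sentence")
    return np.mean(known, axis=0)


def fit_centroids(examples):
    """Map each label to the mean of its training sentence embeddings."""
    by_label = {}
    for vec, label in examples:
        by_label.setdefault(label, []).append(vec)
    return {label: np.mean(vecs, axis=0) for label, vecs in by_label.items()}


def predict(vec, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda lab: np.linalg.norm(vec - centroids[lab]))


# 1. Induce the shared space from a tiny (assumed) translation dictionary.
W = learn_projection(DE, EN, [("steuer", "tax"), ("klima", "climate")])
DE_SHARED = {word: vec @ W for word, vec in DE.items()}

# 2. Train on coded sentences from both languages in the shared space.
train = [
    (embed(["tax", "budget"], EN), "economy"),
    (embed(["klima", "wald"], DE_SHARED), "environment"),
]
centroids = fit_centroids(train)

# 3. Classify unseen sentences from either language.
pred_de = predict(embed(["steuer", "haushalt"], DE_SHARED), centroids)
pred_en = predict(embed(["forest"], EN), centroids)
print(pred_de, pred_en)
```

In this toy setup each topic is seen in only one language during training, yet sentences from the other language can still be classified because both languages share one vector space. With real embeddings the projection would only be approximate, and a stronger classifier would normally replace the nearest-centroid rule.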

Goran Glavas | Simone Paolo Ponzetto | Federico Nanni
