Classification Aware Neural Topic Model and its Application on a New COVID-19 Disinformation Corpus

The explosion of disinformation related to the COVID-19 pandemic has overloaded fact-checkers and media worldwide. To help tackle this, we developed computational methods to support COVID-19 disinformation debunking and research into its social impact. This paper presents: 1) the largest currently available manually annotated COVID-19 disinformation category dataset; and 2) a classification-aware neural topic model (CANTM) that combines classification and topic modelling under a variational autoencoder framework. We demonstrate that CANTM efficiently improves classification performance with low resources, and is scalable. In addition, the classification-aware topics help researchers and end-users to better understand the classification results.
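As a rough illustration of what combining classification and topic modelling under a variational autoencoder framework can look like, the sketch below computes a generic joint objective for one document: a bag-of-words reconstruction term, a KL term against a standard normal prior, and a classification cross-entropy term on the same latent sample. This is not the CANTM architecture described in the paper; the function names, the unweighted sum of the three terms, and all toy parameter values are assumptions made for illustration only.

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def kl_to_standard_normal(mu, logvar):
    # Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) ).
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))


def sample_latent(mu, logvar, rng):
    # Reparameterisation trick: z = mu + sigma * eps, with eps ~ N(0, 1).
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def joint_loss(bow, label, mu, logvar, topic_word_logits, class_weights, rng):
    # Hypothetical joint objective: reconstruction + KL + classification.
    z = sample_latent(mu, logvar, rng)
    theta = softmax(z)  # document-topic proportions
    betas = [softmax(row) for row in topic_word_logits]  # topic-word distributions
    # Word probabilities are a theta-weighted mixture of the topic-word distributions.
    word_probs = [
        sum(theta[k] * betas[k][v] for k in range(len(theta)))
        for v in range(len(bow))
    ]
    recon = -sum(count * math.log(p) for count, p in zip(bow, word_probs))
    # Linear classifier on the latent sample, followed by cross-entropy.
    class_logits = [sum(w * zi for w, zi in zip(row, z)) for row in class_weights]
    clf = -math.log(softmax(class_logits)[label])
    kl = kl_to_standard_normal(mu, logvar)
    return {"recon": recon, "kl": kl, "clf": clf, "total": recon + kl + clf}


# Toy example: 4-word vocabulary, 2 topics, 2 classes (all values are made up).
rng = random.Random(0)
out = joint_loss(
    bow=[2, 0, 1, 0],
    label=1,
    mu=[0.3, -0.2],
    logvar=[-1.0, -1.0],
    topic_word_logits=[[1.0, 0.0, 0.5, -1.0], [-1.0, 0.5, 0.0, 1.0]],
    class_weights=[[1.0, -1.0], [-1.0, 1.0]],
    rng=rng,
)
print(out)
```

Because the classifier and the topic decoder share one latent sample, the topics in a model of this kind are shaped by the labels as well as by word co-occurrence, which is the general intuition behind classification-aware topics.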

Kalina Bontcheva | Diana Maynard | Johann Petrak | Ye Jiang | Xingyi Song | Iknoor Singh
