YNU-HPCC at SemEval-2020 Task 11: LSTM Network for Detection of Propaganda Techniques in News Articles

This paper summarizes our work on propaganda detection for news articles in SemEval-2020 Task 11. The task is divided into the span identification (SI) and technique classification (TC) subtasks. We used GloVe word representations, the BERT pre-trained model, and an LSTM model architecture to accomplish the task. Our approach achieved good results in both subtasks: the macro-F1-score for the SI subtask is 0.406, and the micro-F1-score for the TC subtask is 0.505. Our method significantly outperforms the officially released baseline, and our submissions rank 17th and 22nd on the test set for the SI and TC subtasks, respectively. This paper also compares the performance of different deep learning architectures, such as Bi-LSTM, LSTM, BERT, and XGBoost, on the detection of propaganda techniques in news. The code of this paper is available at: this https URL.
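As a rough illustration of the kind of model the abstract describes, the following is a minimal sketch of a Bi-LSTM token tagger over GloVe-style embeddings for the SI subtask (treated as token-level sequence labeling). It is an assumption about the setup written in PyTorch, not the authors' released code; all class and parameter names here are hypothetical.

```python
# Hypothetical sketch: Bi-LSTM tagger for span identification (SI),
# framed as per-token binary labeling (propaganda / non-propaganda).
# This is NOT the authors' released implementation.
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=128, num_tags=2,
                 pretrained_embeddings=None):
        super().__init__()
        # GloVe vectors can be copied into the embedding layer if available.
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        if pretrained_embeddings is not None:
            self.embedding.weight.data.copy_(pretrained_embeddings)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.dropout = nn.Dropout(0.5)
        self.classifier = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)           # (batch, seq, embed_dim)
        outputs, _ = self.lstm(embedded)               # (batch, seq, 2*hidden_dim)
        return self.classifier(self.dropout(outputs))  # per-token tag logits

# Example usage with random token ids:
# model = BiLSTMTagger(vocab_size=20000)
# logits = model(torch.randint(1, 20000, (4, 50)))    # (4, 50, 2)
```

The TC subtask could reuse the same encoder with a span-level pooling step and a softmax over the 14 technique classes, but that variant is likewise only a sketch.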
