Verbal aggression detection on Twitter comments: convolutional neural network for short-text sentiment analysis

Cyberbullying and hate speech are common problems in online etiquette. To tackle this pressing problem, we propose a text classification model based on convolutional neural networks for the de facto verbal aggression dataset built in our previous work. We observe a significant improvement, which we attribute to the proposed 2D TF-IDF features used in place of pre-trained word vectors. Experiments demonstrate that the proposed system outperforms both our previous methods and other existing methods. A case study of word vectors examines the difficulty of using pre-trained word vectors for our short-text classification task and demonstrates the need for 2D TF-IDF features. Furthermore, we conduct a visual analysis of the convolutional and pooling layers of the trained convolutional neural networks.
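The abstract does not define the 2D TF-IDF features, so the following is only a minimal sketch of one plausible reading, not the authors' construction. It assumes each comment becomes a matrix with one row per token position and one column per vocabulary term, where the single nonzero entry in a row is that token's TF-IDF weight. The function names (`build_idf`, `tfidf_matrix`), the smoothed IDF formula, and the toy corpus are all illustrative assumptions.

```python
import math
from collections import Counter


def build_idf(tokenized_docs):
    """Smoothed inverse document frequency per term (an assumed formula)."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for doc in tokenized_docs:
        doc_freq.update(set(doc))  # count each term once per document
    return {
        term: math.log((1 + n_docs) / (1 + count)) + 1.0
        for term, count in doc_freq.items()
    }


def tfidf_matrix(tokens, vocab, idf, max_len):
    """Hypothetical 2D layout: rows are token positions, columns are vocabulary terms.

    Rows past the end of the comment stay zero (padding); tokens beyond
    max_len are truncated; out-of-vocabulary tokens leave their row at zero.
    """
    term_freq = Counter(tokens)
    column = {term: i for i, term in enumerate(vocab)}
    matrix = [[0.0] * len(vocab) for _ in range(max_len)]
    for pos, tok in enumerate(tokens[:max_len]):
        if tok in column:
            weight = (term_freq[tok] / len(tokens)) * idf[tok]
            matrix[pos][column[tok]] = weight
    return matrix


# Toy corpus (made up for illustration only).
docs = [["you", "are", "great"], ["you", "are", "awful", "awful"]]
idf = build_idf(docs)
vocab = sorted(idf)
features = tfidf_matrix(docs[1], vocab, idf, max_len=5)
```

Under this assumed layout, a comment is a fixed-size 2D array that a convolutional layer can scan along the token axis, much like an image, without relying on pre-trained embeddings.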
