Combining Statistics-Based and CNN-Based Information for Sentence Classification

Sentence classification, the foundation of subsequent text-based processing, continues to attract researchers' attention. Recently, with the great success of deep learning, the convolutional neural network (CNN), a common deep learning architecture, has been widely applied to this field and has achieved excellent performance. However, most CNN-based studies focus on using complex architectures to extract more effective category information, which requires more training time. Aiming for better classification performance at lower time cost, this paper proposes two simple and effective methods that combine information extracted from statistics with information extracted from a CNN. The first method, S-SFCNN, combines statistical features with CNN-based probabilistic classification features to build feature vectors, which are then used to train logistic regression classifiers. The second method, C-SFCNN, combines CNN-based features with statistics-based probabilistic classification features to build feature vectors. In both methods, Naive Bayes log-count ratios serve as the text statistical features, and a single-layer, single-channel CNN is used as the CNN architecture. Experiments on 7 tasks show that our methods achieve better performance than many more complex CNN models at lower time cost. In addition, we summarize the main factors influencing the performance of our methods through experiments.
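As a rough illustration of the S-SFCNN idea, the sketch below computes Naive Bayes log-count ratios for a binary task and concatenates the resulting statistical features with CNN class probabilities. The abstract does not give the exact formulation, so several choices here are assumptions: the usual two-class log-count ratio with add-alpha smoothing, binarized word presence as the base feature, and the hypothetical names `log_count_ratio`, `s_sfcnn_features`, and `cnn_probs` (a placeholder for the output of a separately trained CNN). Training the logistic regression classifier on these vectors is omitted.

```python
import math

def log_count_ratio(docs, labels, vocab, alpha=1.0):
    """Per-term ratio r = log((p / |p|_1) / (q / |q|_1)) for a binary task.

    p and q are add-alpha smoothed document counts of each term in the
    positive (label 1) and negative (label 0) class. Assumed formulation.
    """
    index = {w: i for i, w in enumerate(vocab)}
    p = [alpha] * len(vocab)
    q = [alpha] * len(vocab)
    for tokens, y in zip(docs, labels):
        target = p if y == 1 else q
        for w in set(tokens):  # binarized presence: count each term once per doc
            if w in index:
                target[index[w]] += 1.0
    p_sum, q_sum = sum(p), sum(q)
    return [math.log((pi / p_sum) / (qi / q_sum)) for pi, qi in zip(p, q)]

def s_sfcnn_features(tokens, vocab, ratios, cnn_probs):
    """Concatenate ratio-scaled presence features with CNN class probabilities."""
    present = set(tokens)
    stat = [r if w in present else 0.0 for w, r in zip(vocab, ratios)]
    return stat + list(cnn_probs)

# Toy data, invented purely for illustration.
docs = [["good", "movie"], ["bad", "movie"], ["good", "plot"], ["bad", "plot"]]
labels = [1, 0, 1, 0]
vocab = ["good", "bad", "movie", "plot"]

ratios = log_count_ratio(docs, labels, vocab)
# cnn_probs is a made-up stand-in for a trained CNN's softmax output.
features = s_sfcnn_features(["good", "plot"], vocab, ratios, cnn_probs=[0.2, 0.8])
print(features)
```

Under the same assumptions, C-SFCNN would swap the roles: the CNN's learned feature vector would be concatenated with class probabilities produced by a statistics-based classifier.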
