Grammar guided embedding based Chinese long text sentiment classification

Although state-of-the-art sentiment classification approaches such as LSTM and TextCNN perform well on Chinese short text sentiment analysis, Chinese long text sentiment classification remains challenging because of two problems: sentiment change within a text and long text structure. We therefore propose a grammar guided embedding (GGE) model and a novel Chinese long text sentiment classification framework. First, we introduce part-of-speech (POS) tags as grammar guidance information for Chinese long text, which helps classification approaches model the structure of the long text and the structures that signal sentiment change. Second, we propose a simple GGE training method that uses a combined representation of the word sequence and the POS sequence. Finally, we propose a unified framework that combines GGE with TextCNN. Experimental results show that, with GGE, the model outperforms the state-of-the-art approaches. We also find that GGE helps the model converge faster: with only a small amount of training data, it achieves better results than the same model without GGE. We therefore believe that GGE can help machines better understand the structure of sentiment expression in human language.
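The abstract does not specify how the word sequence and the POS sequence are combined, so the following is only a minimal sketch of one plausible reading: each token's word vector is concatenated with the vector of its POS tag, giving a matrix that a TextCNN-style classifier could take as input. The function names `build_table` and `embed_with_pos`, the randomly initialized lookup tables, the dimensions, and the example sentence with its tags are all illustrative assumptions, not the authors' training method.

```python
import random
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def build_table(symbols: Sequence[str], dim: int, seed: int) -> Dict[str, Vector]:
    """Assign a small random vector to every distinct symbol.

    Assumption: random initialization stands in for embeddings that
    would normally be learned during training.
    """
    rng = random.Random(seed)
    return {
        symbol: [rng.uniform(-0.1, 0.1) for _ in range(dim)]
        for symbol in sorted(set(symbols))
    }


def embed_with_pos(
    tagged: Sequence[Tuple[str, str]],
    word_table: Dict[str, Vector],
    pos_table: Dict[str, Vector],
) -> List[Vector]:
    """Concatenate each word vector with the vector of its POS tag."""
    return [word_table[word] + pos_table[tag] for word, tag in tagged]


# Hypothetical pre-segmented and pre-tagged sentence; the tags are
# illustrative and do not follow any particular tagger's output.
tagged = [
    ("这部", "r"), ("电影", "n"), ("虽然", "c"),
    ("很长", "a"), ("但是", "c"), ("精彩", "a"),
]

word_table = build_table([w for w, _ in tagged], dim=4, seed=0)
pos_table = build_table([t for _, t in tagged], dim=2, seed=1)

matrix = embed_with_pos(tagged, word_table, pos_table)
print(len(matrix), len(matrix[0]))  # 6 6
```

In this sketch the two conjunctions ("虽然" and "但是") share the same POS sub-vector even though their word vectors differ, which is one way grammatical cues for a change in sentiment could be exposed to the downstream convolutional filters.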
