Phonetic-enriched Text Representation for Chinese Sentiment Analysis with Reinforcement Learning

The Chinese pronunciation system has two characteristics that distinguish it from other languages: deep phonemic orthography and intonation variations. We are the first to argue that these two properties can play a major role in Chinese sentiment analysis. In particular, we propose two effective features to encode phonetic information. We then develop a Disambiguate Intonation for Sentiment Analysis (DISA) network based on reinforcement learning, which disambiguates the intonation of each Chinese character's pinyin so that a precise phonetic representation of Chinese is learned. We also fuse the phonetic features with textual and visual features to mimic the way humans read and understand Chinese text. Experimental results on five Chinese sentiment analysis datasets show that including phonetic features significantly and consistently improves the performance of textual and visual representations and outperforms state-of-the-art Chinese character-level representations.
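The idea of learning which intonation to assign to each pinyin syllable from a reward signal can be illustrated with a small policy-gradient (REINFORCE) sketch. This is not the DISA network itself: the tabular softmax policy, the moving-average baseline, the function names (`train_tone_policy`, `toy_reward`), and the toy reward are all assumptions made for illustration. In a real system the reward would presumably come from a downstream sentiment classifier rather than from a known target.

```python
import math
import random

# Assumption: five tone classes (four Mandarin tones plus the neutral tone).
TONES = [1, 2, 3, 4, 5]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_tone_policy(syllables, reward_fn, episodes=2000, lr=0.5, seed=0):
    """REINFORCE over a tabular policy: one logit vector per toneless syllable."""
    rng = random.Random(seed)
    logits = {s: [0.0] * len(TONES) for s in syllables}
    baseline = 0.0  # moving average of rewards, used to reduce variance
    for _ in range(episodes):
        # Sample one tone index per syllable from the current policy.
        actions = {}
        for s in syllables:
            probs = softmax(logits[s])
            actions[s] = rng.choices(range(len(TONES)), weights=probs)[0]
        reward = reward_fn({s: TONES[a] for s, a in actions.items()})
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # Gradient of log softmax w.r.t. logit k is (1[k == a] - p_k).
        for s, a in actions.items():
            probs = softmax(logits[s])
            for k in range(len(TONES)):
                indicator = 1.0 if k == a else 0.0
                logits[s][k] += lr * advantage * (indicator - probs[k])
    # Return the most probable tone for each syllable.
    return {
        s: TONES[max(range(len(TONES)), key=lambda k: logits[s][k])]
        for s in syllables
    }


# Hypothetical stand-in for classifier feedback: the fraction of syllables
# whose sampled tone matches a hidden target assignment.
TARGET = {"hao": 3, "ma": 1}


def toy_reward(chosen):
    return sum(chosen[s] == t for s, t in TARGET.items()) / len(TARGET)


learned = train_tone_policy(list(TARGET), toy_reward)
print(learned)
```

The policy only ever sees a scalar reward for the whole tone assignment, never the correct tone of an individual syllable, which is the reason a reinforcement-style update is used here instead of a supervised loss.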
