Structural information aware deep semi-supervised recurrent neural network for sentiment analysis

With the development of Internet, people are more likely to post and propagate opinions online. Sentiment analysis is then becoming an important challenge to understand the polarity beneath these comments. Currently a lot of approaches from natural language processing’s perspective have been employed to conduct this task. The widely used ones include bag-of-words and semantic oriented analysis methods. In this research, we further investigate the structural information among words, phrases and sentences within the comments to conduct the sentiment analysis. The idea is inspired by the fact that the structural information is playing important role in identifying the overall statement’s polarity. As a result a novel sentiment analysis model is proposed based on recurrent neural network, which takes the partial document as input and then the next parts to predict the sentiment label distribution rather than the next word. The proposed method learns words representation simultaneously the sentiment distribution. Experimental studies have been conducted on commonly used datasets and the results have shown its promising potential.

[1]  Yoshua Bengio,et al.  Neural Probabilistic Language Models , 2006 .

[2]  Rong Zheng,et al.  From fingerprint to writeprint , 2006, Commun. ACM.

[3]  Quoc V. Le,et al.  Recurrent Neural Networks for Noise Reduction in Robust ASR , 2012, INTERSPEECH.

[4]  Ronen Feldman,et al.  Using Corpus Statistics on Entities to Improve Semi-supervised Relation Extraction from the Web , 2007, ACL.

[5]  Geoffrey Zweig,et al.  Recurrent neural networks for language understanding , 2013, INTERSPEECH.

[6]  Geoffrey E. Hinton,et al.  Visualizing non-metric similarities in multiple maps , 2011, Machine Learning.

[7]  Marilyn A. Walker,et al.  Learning to Generate Naturalistic Utterances Using Reviews in Spoken Dialogue Systems , 2006, ACL.

[8]  Slav Petrov,et al.  Efficient Graph-Based Semi-Supervised Learning of Structured Tagging Models , 2010, EMNLP.

[9]  Pascal Vincent,et al.  The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training , 2009, AISTATS.

[10]  Amos Storkey,et al.  Advances in Neural Information Processing Systems 20 , 2007 .

[11]  Razvan Pascanu,et al.  On the difficulty of training recurrent neural networks , 2012, ICML.

[12]  Xiaojin Zhu,et al.  Introduction to Semi-Supervised Learning , 2009, Synthesis Lectures on Artificial Intelligence and Machine Learning.

[13]  Geoffrey E. Hinton,et al.  Reducing the Dimensionality of Data with Neural Networks , 2006, Science.

[14]  Michael Egmont-Petersen,et al.  Image processing with neural networks - a review , 2002, Pattern Recognit..

[15]  Christopher Potts,et al.  Learning Word Vectors for Sentiment Analysis , 2011, ACL.

[16]  Claire Cardie,et al.  Combining Low-Level and Summary Representations of Opinions for Multi-Perspective Question Answering , 2003, New Directions in Question Answering.

[17]  Christopher D. Manning,et al.  Introduction to Information Retrieval , 2010, J. Assoc. Inf. Sci. Technol..

[18]  Kentaro Inui,et al.  Dependency Tree-based Sentiment Classification using CRFs with Hidden Variables , 2010, NAACL.

[19]  Ari Rappoport,et al.  Enhanced Sentiment Learning Using Twitter Hashtags and Smileys , 2010, COLING.

[20]  Jeffrey Pennington,et al.  Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions , 2011, EMNLP.

[21]  Klaus Winkelmann Conference on Innovative Applications of Artificial Intelligence , 1989, Künstliche Intell..

[22]  David Yarowsky,et al.  Exploring Sentiment in Social Media: Bootstrapping Subjectivity Clues from Multilingual Twitter Streams , 2013, ACL.

[23]  Jeffrey L. Elman,et al.  Finding Structure in Time , 1990, Cogn. Sci..

[24]  Geoffrey E. Hinton,et al.  Learning representations by back-propagating errors , 1986, Nature.

[25]  Bo Pang,et al.  Thumbs up? Sentiment Classification using Machine Learning Techniques , 2002, EMNLP.

[26]  Janyce Wiebe,et al.  Articles: Recognizing Contextual Polarity: An Exploration of Features for Phrase-Level Sentiment Analysis , 2009, CL.

[27]  Peter D. Turney Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews , 2002, ACL.

[28]  Yee Whye Teh,et al.  Rate-coded Restricted Boltzmann Machines for Face Recognition , 2000, NIPS.

[29]  Lukás Burget,et al.  Recurrent neural network based language model , 2010, INTERSPEECH.

[30]  Gholamreza Haffari,et al.  Transductive learning for statistical machine translation , 2007, ACL.

[31]  James W. Pennebaker,et al.  Linguistic Inquiry and Word Count (LIWC2007) , 2007 .

[32]  Vysoké Učení,et al.  Statistical Language Models Based on Neural Networks , 2012 .

[33]  David M. Pennock,et al.  Mining the peanut gallery: opinion extraction and semantic classification of product reviews , 2003, WWW '03.

[34]  Long Jiang,et al.  User-level sentiment analysis incorporating social networks , 2011, KDD.

[35]  Christopher D. Manning,et al.  Exploring Sentiment Summarization , 2004 .

[36]  Lillian Lee,et al.  Opinion Mining and Sentiment Analysis , 2008, Found. Trends Inf. Retr..

[37]  Lukás Burget,et al.  Extensions of recurrent neural network language model , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[38]  Ming Zhou,et al.  Sentence-Level Sentiment Analysis via Sequence Modeling , 2011, ICAIC.

[39]  Claire Cardie,et al.  Annotating Expressions of Opinions and Emotions in Language , 2005, Lang. Resour. Evaluation.

[40]  Lukás Burget,et al.  Recurrent Neural Network Based Language Modeling in Meeting Recognition , 2011, INTERSPEECH.

[41]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[42]  Pascal Vincent,et al.  Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[43]  Jason Weston,et al.  A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.

[44]  Bo Pang,et al.  Seeing Stars: Exploiting Class Relationships for Sentiment Categorization with Respect to Rating Scales , 2005, ACL.

[45]  Brian Kingsbury,et al.  Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.

[46]  Eugene Charniak,et al.  Effective Self-Training for Parsing , 2006, NAACL.

[47]  Bing Liu,et al.  Mining Opinion Features in Customer Reviews , 2004, AAAI.

[48]  Huan Liu,et al.  Unsupervised sentiment analysis with emotional signals , 2013, WWW.

[49]  Volkmar Frinken,et al.  A Novel Word Spotting Algorithm Using Bidirectional Long Short-Term Memory Neural Networks , 2010, ANNPR.

[50]  Minyi Guo,et al.  Emoticon Smoothed Language Models for Twitter Sentiment Analysis , 2012, AAAI.

[51]  Sabine Bergler,et al.  Mining WordNet for a Fuzzy Sentiment: Sentiment Tag Extraction from WordNet Glosses , 2006, EACL.

[52]  Geoffrey E. Hinton,et al.  A Scalable Hierarchical Distributed Language Model , 2008, NIPS.

[53]  Bing Liu,et al.  Mining and summarizing customer reviews , 2004, KDD.

[54]  Marshall S. Smith,et al.  The general inquirer: A computer approach to content analysis. , 1967 .

[55]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[56]  Terence D. Sanger,et al.  Optimal unsupervised learning in a single-layer linear feedforward neural network , 1989, Neural Networks.

[57]  Takeo Kanade,et al.  Neural Network-Based Face Detection , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[58]  Paul J. Werbos,et al.  Backpropagation Through Time: What It Does and How to Do It , 1990, Proc. IEEE.

[59]  Zhi-Hua Zhou,et al.  Learning with Unlabeled Data and Its Application to Image Retrieval , 2006, PRICAI.

[60]  Yoshua Bengio,et al.  Greedy Layer-Wise Training of Deep Networks , 2006, NIPS.

[61]  Pascal Vincent,et al.  Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion , 2010, J. Mach. Learn. Res..

[62]  Carl Vogel,et al.  Proceedings of the 16th International Conference on Computational Linguistics , 1996, COLING 1996.

[63]  Jean-Luc Gauvain,et al.  Connectionist language modeling for large vocabulary continuous speech recognition , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[64]  Alexander H. Waibel,et al.  Modular Construction of Time-Delay Neural Networks for Speech Recognition , 1989, Neural Computation.

[65]  Yoshua Bengio,et al.  Why Does Unsupervised Pre-training Help Deep Learning? , 2010, AISTATS.

[66]  Shlomo Argamon,et al.  Using appraisal groups for sentiment analysis , 2005, CIKM '05.

[67]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[68]  Satoshi Morinaga,et al.  Mining product reputations on the Web , 2002, KDD.

[69]  Shie Mannor,et al.  A Tutorial on the Cross-Entropy Method , 2005, Ann. Oper. Res..

[70]  Lorenzo Bruzzone,et al.  A Novel Transductive SVM for Semisupervised Classification of Remote-Sensing Images , 2006, IEEE Transactions on Geoscience and Remote Sensing.

[71]  J J Hopfield,et al.  Neural networks and physical systems with emergent collective computational abilities. , 1982, Proceedings of the National Academy of Sciences of the United States of America.

[72]  Marc'Aurelio Ranzato,et al.  Sparse Feature Learning for Deep Belief Networks , 2007, NIPS.

[73]  Noah A. Smith,et al.  Contrastive Estimation: Training Log-Linear Models on Unlabeled Data , 2005, ACL.

[74]  Dong-Hyun Lee,et al.  Pseudo-Label : The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks , 2013 .

[75]  Thorsten Joachims,et al.  Transductive Inference for Text Classification using Support Vector Machines , 1999, ICML.

[76]  Soo-Min Kim,et al.  Automatic Identification of Pro and Con Reasons in Online Reviews , 2006, ACL.

[77]  Ilya Sutskever,et al.  Training Deep and Recurrent Networks with Hessian-Free Optimization , 2012, Neural Networks: Tricks of the Trade.

[78]  Yee Whye Teh,et al.  A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[79]  Marvin Minsky,et al.  Perceptrons: An Introduction to Computational Geometry , 1969 .